1 00:00:00,000 --> 00:00:04,640 Welcome back to the Deep Dive. Today we're opening the vault on something that's 2 00:00:04,640 --> 00:00:04,960 just 3 00:00:04,960 --> 00:00:08,970 foundational for anyone in cultural heritage. It really is. We're taking a look at 4 00:00:08,970 --> 00:00:09,920 ArchivesSpace, 5 00:00:09,920 --> 00:00:14,560 that specialized application that, well, thousands of places use to manage 6 00:00:14,560 --> 00:00:15,760 everything from archives 7 00:00:15,760 --> 00:00:20,510 to manuscripts and now more and more digital collections. Right. And if you've ever 8 00:00:20,510 --> 00:00:20,960 wondered 9 00:00:20,960 --> 00:00:26,060 how, you know, a big university or historical society keeps track of literally 10 00:00:26,060 --> 00:00:26,640 everything, 11 00:00:26,640 --> 00:00:30,720 like a box of 19th-century letters and a terabyte of photos, ArchivesSpace is 12 00:00:30,720 --> 00:00:31,520 probably the answer. 13 00:00:31,520 --> 00:00:35,520 Exactly. So, our mission today is pretty simple. We just want to give you a clear, 14 00:00:35,520 --> 00:00:39,920 beginner-friendly way into this tool to understand what it does, why it even needs 15 00:00:39,920 --> 00:00:40,720 to exist, 16 00:00:40,720 --> 00:00:45,420 and how this really unique community model keeps it all going. Okay, let's unpack 17 00:00:45,420 --> 00:00:45,760 this. 18 00:00:45,760 --> 00:00:49,620 Sounds good. But first, a quick thank you to the supporter of this Deep Dive, Safe 19 00:00:49,620 --> 00:00:50,400 Server. 20 00:00:50,400 --> 00:00:55,010 Safe Server commits to hosting this software and supports you in your digital 21 00:00:55,010 --> 00:00:56,400 transformation. You 22 00:00:56,400 --> 00:01:02,800 can find more info at www.safeserver.de. Their support really does help us bring 23 00:01:02,800 --> 00:01:03,280 you these kinds 24 00:01:03,280 --> 00:01:07,630 of explorations. 
So, when we start talking about ArchivesSpace, I think the first 25 00:01:07,630 --> 00:01:08,640 thing to get is 26 00:01:08,640 --> 00:01:13,410 its origin. It really is the leading open-source tool for this, but it wasn't 27 00:01:13,410 --> 00:01:14,320 designed in a 28 00:01:14,320 --> 00:01:19,440 corporate boardroom somewhere. Right. It was literally built for archives by archivists. 29 00:01:19,440 --> 00:01:24,000 Which feels important. It's everything. This tool exists because the off-the-shelf 30 00:01:24,000 --> 00:01:24,320 stuff 31 00:01:24,320 --> 00:01:28,670 for libraries or museums, it just couldn't handle the complexity of archives, the 32 00:01:28,670 --> 00:01:29,200 way things are 33 00:01:29,200 --> 00:01:33,750 arranged, the hierarchy. It's just different. So, it was a necessity. A total 34 00:01:33,750 --> 00:01:34,560 necessity. 35 00:01:34,560 --> 00:01:39,660 The first version, ArchivesSpace 1.0, came out in 2013, and it was this huge 36 00:01:39,660 --> 00:01:41,040 collaboration with 37 00:01:41,040 --> 00:01:46,320 places like NYU Libraries, UC San Diego, University of Illinois, all backed by the 38 00:01:46,320 --> 00:01:47,520 Mellon Foundation, 39 00:01:47,520 --> 00:01:52,000 and with organizational help from Lyrasis. It was built to be stable from day one. 40 00:01:52,000 --> 00:01:56,190 That phrase, built by archivists, that really sticks with me. Because you're right, 41 00:01:56,190 --> 00:01:56,720 a library 42 00:01:56,720 --> 00:02:01,820 might have thousands of individual books, one barcode each, but an archive, that's 43 00:02:01,820 --> 00:02:02,080 a whole 44 00:02:02,080 --> 00:02:06,880 different beast. It's millions of unique, connected things in one collection. So, 45 00:02:06,880 --> 00:02:08,000 why does that need 46 00:02:08,000 --> 00:02:14,320 its own software? What can ArchivesSpace do that, say, a really good database couldn't? 47 00:02:14,320 --> 00:02:18,800 It really boils down to two things. 
Maintaining what we call intellectual control 48 00:02:18,800 --> 00:02:19,440 and physical 49 00:02:19,440 --> 00:02:25,680 control, and doing both at the same time. It's a single system that supports the 50 00:02:25,680 --> 00:02:28,720 entire life cycle 51 00:02:28,720 --> 00:02:33,270 of archival work. Everything from the moment an item arrives to the moment a 52 00:02:33,270 --> 00:02:34,560 researcher finds it. 53 00:02:34,560 --> 00:02:37,920 So, let's walk through that life cycle. For someone new to this, the terms can be a 54 00:02:37,920 --> 00:02:38,160 bit 55 00:02:38,160 --> 00:02:42,080 technical. Sure. So, there are five essential stages. 56 00:02:42,080 --> 00:02:44,160 What's the first one? The first is accessioning. 57 00:02:44,160 --> 00:02:47,600 That's just the intake. The moment a collection comes through the door, 58 00:02:47,600 --> 00:02:51,610 you're recording what you got, who you got it from, the basic legal and procedural 59 00:02:51,610 --> 00:02:52,240 stuff. 60 00:02:52,240 --> 00:02:55,040 So, it's the official record of arrival. Got it. What's next? 61 00:02:55,040 --> 00:02:59,650 Second is arrangement. And this is critical. Archives aren't just random piles of 62 00:02:59,650 --> 00:03:00,080 stuff. 63 00:03:00,080 --> 00:03:02,880 They have an original order, the way the creator kept them. 64 00:03:02,880 --> 00:03:06,640 Right. And that order has meaning. Exactly. So, the system lets you map out 65 00:03:06,640 --> 00:03:09,840 that hierarchy digitally so the original context isn't lost. 66 00:03:09,840 --> 00:03:12,320 Okay. That makes perfect sense. What's number three? 67 00:03:12,320 --> 00:03:16,400 Third is description. This is where you create all that crucial metadata. 68 00:03:16,400 --> 00:03:19,840 Basically, you're writing the finding aids. The roadmap for researchers. 69 00:03:19,840 --> 00:03:23,760 Precisely. The guide that tells someone what's in box three, folder six. 
70 00:03:23,760 --> 00:03:29,520 And ArchivesSpace helps generate those in standard formats, like EAD, so they work 71 00:03:29,520 --> 00:03:29,920 everywhere. 72 00:03:29,920 --> 00:03:32,960 So, it does the heavy lifting on those really complex documents. 73 00:03:32,960 --> 00:03:38,110 It does. Then fourth, you have preservation. This part tracks the physical side of 74 00:03:38,110 --> 00:03:38,720 things. 75 00:03:38,720 --> 00:03:41,810 Where is it located? What are the environmental conditions? Does it need 76 00:03:41,810 --> 00:03:42,880 conservation? 77 00:03:42,880 --> 00:03:45,920 All to ensure its long-term health. And the last one? 78 00:03:45,920 --> 00:03:50,880 And finally, number five is access. This is the payoff. It's the public interface 79 00:03:50,880 --> 00:03:55,040 that lets people actually search and discover all this amazing material, 80 00:03:55,040 --> 00:03:57,360 connecting all that backend work to the researcher. 81 00:03:57,360 --> 00:04:00,480 Wow. Okay. When you lay it out like that, yeah, a spreadsheet just isn't going to cut 82 00:04:00,480 --> 00:04:00,720 it. 83 00:04:00,720 --> 00:04:03,040 Not even close. It really is an essential piece of 84 00:04:03,040 --> 00:04:07,440 digital infrastructure. And on the technical side, it feels just as solid. You can 85 00:04:07,440 --> 00:04:08,000 tell it's built 86 00:04:08,000 --> 00:04:11,920 to last. Yeah. I mean, we don't have to get lost in the code, but the fact that it's 87 00:04:11,920 --> 00:04:12,320 built on a 88 00:04:12,320 --> 00:04:16,740 mature language like Ruby says a lot. This isn't some quick web app. It's a 89 00:04:16,740 --> 00:04:18,400 serious platform meant 90 00:04:18,400 --> 00:04:23,120 for the long haul. Which is what you need when you're managing history. And you see 91 00:04:23,120 --> 00:04:23,680 that reflected 92 00:04:23,680 --> 00:04:28,780 in the development activity. You can look at GitHub and see the numbers. 
385 stars, 93 00:04:28,780 --> 00:04:30,240 238 forks. 94 00:04:30,240 --> 00:04:34,080 Which shows people are paying attention. Right. But the number that really tells 95 00:04:34,080 --> 00:04:34,720 the story 96 00:04:34,720 --> 00:04:39,360 is the 96 contributors. That's not just a couple of developers. That's a dedicated 97 00:04:39,360 --> 00:04:40,080 community of 98 00:04:40,080 --> 00:04:45,180 professionals putting in their own time and expertise. 96 people. That's a lot of 99 00:04:45,180 --> 00:04:45,360 brain 100 00:04:45,360 --> 00:04:49,060 power. And that kind of active dedication, that really brings us to the most 101 00:04:49,060 --> 00:04:50,080 fascinating part of 102 00:04:50,080 --> 00:04:53,850 this whole story. It really is. Because what's so interesting here is that ArchivesSpace 103 00:04:53,850 --> 00:04:55,360 isn't just 104 00:04:55,360 --> 00:05:00,880 software you download. It's a community. In what way? I mean, it's an organized 105 00:05:00,880 --> 00:05:02,400 body of archivists, 106 00:05:02,400 --> 00:05:07,600 librarians, developers, administrators, all working together. It's community-supported 107 00:05:07,600 --> 00:05:08,480 software. 108 00:05:08,480 --> 00:05:11,280 The users aren't just customers. They're the owners. They decide where the software 109 00:05:11,280 --> 00:05:11,760 goes next. 110 00:05:11,760 --> 00:05:15,760 They fund it. They manage it. They implement it. That sounds amazing in theory, but 111 00:05:15,760 --> 00:05:20,880 how does that work in practice? I mean, are archivists expected to learn how to 112 00:05:20,880 --> 00:05:21,840 code in Ruby? 113 00:05:21,840 --> 00:05:25,760 Where does that technical skill come from? That's a great question. No, it's not 114 00:05:25,760 --> 00:05:26,160 all on the 115 00:05:26,160 --> 00:05:30,720 archivists. The development work tends to come from three main places. 
You have 116 00:05:30,720 --> 00:05:34,560 developers at member institutions, developers from vendor partners who host it, 117 00:05:34,560 --> 00:05:35,840 like Safe Server, 118 00:05:35,840 --> 00:05:40,320 and then the core program team, which the community's membership fees actually fund. 119 00:05:40,320 --> 00:05:43,780 So there's a professional core. A professional core guided by the community. And 120 00:05:43,780 --> 00:05:44,400 you see that 121 00:05:44,400 --> 00:05:48,320 commitment all the time. Like they just announced Martha Tenney is joining as the 122 00:05:48,320 --> 00:05:49,520 new standards and 123 00:05:49,520 --> 00:05:52,850 testing archivist. They're making sure everything stays up to professional 124 00:05:52,850 --> 00:05:53,920 standards. They're even 125 00:05:53,920 --> 00:05:58,630 planning their 2026 virtual member forum already. This is a very active, very 126 00:05:58,630 --> 00:05:59,840 organized ecosystem. 127 00:05:59,840 --> 00:06:04,560 That active engagement must be what makes it work so well in the real world. And 128 00:06:04,560 --> 00:06:05,280 here's where it gets 129 00:06:05,280 --> 00:06:09,800 really interesting when you hear from the people actually using it every day. Like 130 00:06:09,800 --> 00:06:10,560 the testimonials 131 00:06:10,560 --> 00:06:14,080 tell the whole story. I was reading what Tessa Wakefield from the University of 132 00:06:14,080 --> 00:06:14,800 Northern Iowa 133 00:06:14,800 --> 00:06:18,990 said. She mentioned that it gives her staff more autonomy. They can manage things 134 00:06:18,990 --> 00:06:20,400 more effectively 135 00:06:20,400 --> 00:06:23,990 because they aren't waiting on some outside company. They have a say in the tools 136 00:06:23,990 --> 00:06:24,560 they use. 137 00:06:24,560 --> 00:06:28,960 Exactly. They can help build the solution they need. And it's not just the archivists. 138 00:06:28,960 --> 00:06:29,440 What about 139 00:06:29,440 --> 00:06:33,850 the IT staff? 
Right. They're often the forgotten piece of the puzzle. Totally. But 140 00:06:33,850 --> 00:06:34,720 Tom McNeely, 141 00:06:34,720 --> 00:06:40,160 he's IT at Western Washington University, he said it was pretty easy to install and 142 00:06:40,160 --> 00:06:41,280 upgrade. And he 143 00:06:41,280 --> 00:06:46,380 praised the technical documentation. When the IT team is happy, that saves everyone 144 00:06:46,380 --> 00:06:47,200 time and money. 145 00:06:47,200 --> 00:06:51,680 That's a huge win. Good documentation is priceless. It really is. But let's bring 146 00:06:51,680 --> 00:06:51,760 it 147 00:06:51,760 --> 00:06:56,320 back to the public, to the researchers. The impact on discovery is just massive. 148 00:06:56,320 --> 00:06:57,280 Heidi Pettit at 149 00:06:57,280 --> 00:07:03,860 Lawrence College talked about going from over a hundred separate finding aids to a 150 00:07:03,860 --> 00:07:04,640 single 151 00:07:04,640 --> 00:07:09,440 searchable collection in ArchivesSpace. Can you imagine trying to do research by 152 00:07:09,440 --> 00:07:10,240 searching a 153 00:07:10,240 --> 00:07:15,340 hundred different PDFs one by one? That sounds like a nightmare. A unified system 154 00:07:15,340 --> 00:07:16,160 is a complete 155 00:07:16,160 --> 00:07:20,350 game changer for research. It's a revolutionary leap. And that all comes back to 156 00:07:20,350 --> 00:07:21,040 that community 157 00:07:21,040 --> 00:07:25,230 governance. I think Bre McLaughlin at Indiana University put it perfectly. She said 158 00:07:25,230 --> 00:07:25,360 she 159 00:07:25,360 --> 00:07:30,460 appreciates that feedback and concerns are actually heard. That feeling of having a 160 00:07:30,460 --> 00:07:31,440 real voice is so 161 00:07:31,440 --> 00:07:35,360 rare with enterprise software. Usually you just pay your fee and hope for the best. 162 00:07:35,360 --> 00:07:35,920 That's right. 163 00:07:35,920 --> 00:07:39,520 You're a customer, not a partner. 
Which brings us to the question that can be a 164 00:07:39,520 --> 00:07:40,240 little confusing 165 00:07:40,240 --> 00:07:45,440 for newcomers. If the software is open source, you know, free to download and use, 166 00:07:45,440 --> 00:07:47,120 then why is 167 00:07:47,120 --> 00:07:51,200 there a membership model? It feels like a paradox. It's a great point and it's the 168 00:07:51,200 --> 00:07:52,160 absolute key to 169 00:07:52,160 --> 00:07:56,850 its survival. The code is free, yes, but running a program team, offering 170 00:07:56,850 --> 00:07:58,240 professional support, 171 00:07:58,240 --> 00:08:03,040 coordinating all that development, that takes real money. So the membership model 172 00:08:03,040 --> 00:08:03,360 is for 173 00:08:03,360 --> 00:08:07,780 sustainability. Exactly. It's a collective fund to ensure the tool not only 174 00:08:07,780 --> 00:08:08,400 survives 175 00:08:08,400 --> 00:08:12,150 but continues to evolve for the good of the whole field. So it's less like buying a 176 00:08:12,150 --> 00:08:12,960 product and more 177 00:08:12,960 --> 00:08:17,360 like, I don't know, supporting public radio. You're funding a shared resource. That's 178 00:08:17,360 --> 00:08:17,440 a 179 00:08:17,440 --> 00:08:21,830 perfect analogy. And members get concrete benefits for it. Like what? Well, there 180 00:08:21,830 --> 00:08:22,560 are tangible 181 00:08:22,560 --> 00:08:27,000 things like getting technical support directly from the program team and access to 182 00:08:27,000 --> 00:08:27,920 the user manual. 183 00:08:27,920 --> 00:08:34,080 But the intangible benefit is the big one. Having a real voice in the future of the 184 00:08:34,080 --> 00:08:34,800 software. 185 00:08:34,800 --> 00:08:38,990 And a seat at the table. A seat at the table. 
Plus, by investing in this shared 186 00:08:38,990 --> 00:08:39,920 infrastructure, 187 00:08:39,920 --> 00:08:44,160 institutions protect themselves from being locked in by a single commercial vendor 188 00:08:44,160 --> 00:08:48,720 who could suddenly change the rules or raise prices. It keeps the power in the 189 00:08:48,720 --> 00:08:49,280 hands of the 190 00:08:49,280 --> 00:08:53,840 users. It's like an insurance policy and a way to drive standardization all at once. 191 00:08:53,840 --> 00:08:54,560 I also noticed 192 00:08:54,560 --> 00:08:58,240 how they structured the fees. Yeah, the tiered levels are important. Very. They 193 00:08:58,240 --> 00:08:58,720 have five 194 00:08:58,720 --> 00:09:03,230 different membership levels, so a huge university and a small local museum can both 195 00:09:03,230 --> 00:09:04,000 participate 196 00:09:04,000 --> 00:09:07,740 and have their voices heard. It spreads the cost fairly across everyone who 197 00:09:07,740 --> 00:09:08,880 benefits. It makes it 198 00:09:08,880 --> 00:09:13,990 accessible for everyone. It really does. So what we have is this hyper-specialized 199 00:09:13,990 --> 00:09:15,280 tool that thrives 200 00:09:15,280 --> 00:09:20,640 on shared investment and, well, democratic governance. And it's clearly working. The 201 00:09:20,640 --> 00:09:21,360 latest release, 202 00:09:21,360 --> 00:09:27,870 V4.1.1, just came out on July 1st, 2025. It's just a fantastic model for building 203 00:09:27,870 --> 00:09:28,240 critical 204 00:09:28,240 --> 00:09:31,780 infrastructure. It really is a key example of how open source, when you back it 205 00:09:31,780 --> 00:09:32,960 with a smart funding 206 00:09:32,960 --> 00:09:37,230 and community model, can lead to real innovation and standardization. Without being 207 00:09:37,230 --> 00:09:38,320 driven by profit. 208 00:09:38,320 --> 00:09:43,420 Exactly. And think of the efficiency. 
Tom Adams from Cold Spring Harbor Laboratory 209 00:09:43,420 --> 00:09:43,600 pointed out 210 00:09:43,600 --> 00:09:47,370 that because so many institutions use it, they can leverage plugins and tools 211 00:09:47,370 --> 00:09:48,480 others have built. 212 00:09:48,480 --> 00:09:50,880 They don't have to spend a ton of money on in-house development. 213 00:09:50,880 --> 00:09:55,360 That shared effort saves everyone time and money. It lifts the whole sector. 214 00:09:55,360 --> 00:09:56,080 It really does. 215 00:09:56,080 --> 00:10:00,950 That is just a tremendously powerful model. So as we wrap up this deep dive, here's 216 00:10:00,950 --> 00:10:01,360 a final 217 00:10:01,360 --> 00:10:05,100 thought for you to consider. How does this model, free software sustained by a 218 00:10:05,100 --> 00:10:05,920 professional 219 00:10:05,920 --> 00:10:10,650 membership, compare to the other essential digital tools you use every day, the 220 00:10:10,650 --> 00:10:11,760 ones built by massive 221 00:10:11,760 --> 00:10:16,860 companies? Does the ArchivesSpace model maybe ensure a better, more focused 222 00:10:16,860 --> 00:10:17,920 response to the 223 00:10:17,920 --> 00:10:22,400 actual needs of its users, the archivists? It's something to mull over. 224 00:10:22,400 --> 00:10:23,760 Definitely something to think about. 225 00:10:23,760 --> 00:10:27,680 And that wraps up our deep dive. Thank you again to Safe Server for supporting this 226 00:10:27,680 --> 00:10:28,560 exploration. 227 00:10:28,560 --> 00:10:31,880 Remember, Safe Server takes care of the hosting of this software and supports you 228 00:10:31,880 --> 00:10:32,480 in your digital 229 00:10:32,480 --> 00:10:38,350 transformation. You can find out more at www.safeserver.de. We'll catch you next 230 00:10:38,350 --> 00:10:39,840 time for the next deep dive.