Today's Deep-Dive: ArchivesSpace

0:00

Welcome back to the Deep Dive. Today we're opening the vault on something that's

0:04

just

0:04

foundational for anyone in cultural heritage. It really is. We're taking a look at

0:08

ArchivesSpace,

0:09

that specialized application that, well, thousands of places use to manage

0:14

everything from archives

0:15

to manuscripts and now more and more digital collections. Right. And if you've ever

0:20

wondered

0:20

how, you know, a big university or historical society keeps track of literally

0:26

everything,

0:26

like a box of 19th-century letters and a terabyte of photos, ArchivesSpace is

0:30

probably the answer.

0:31

Exactly. So, our mission today is pretty simple. We just want to give you a clear,

0:35

beginner-friendly way into this tool to understand what it does, why it even needs

0:39

to exist,

0:40

and how this really unique community model keeps it all going. Okay, let's unpack

0:45

this.

0:45

Sounds good. But first, a quick thank you to the supporter of this Deep Dive, Safe

0:49

Server.

0:50

Safe Server commits to hosting this software and supports you in your digital

0:55

transformation. You

0:56

can find more info at www.safeserver.de. Their support really does help us bring

1:02

you these kinds

1:03

of explorations. So, when we start talking about ArchivesSpace, I think the first

1:07

thing to get is

1:08

its origin. It really is the leading open-source tool for this, but it wasn't

1:13

designed in a

1:14

corporate boardroom somewhere. Right. It was literally built for archives by archivists.

1:19

Which feels important. It's everything. This tool exists because the off-the-shelf

1:24

stuff

1:24

for libraries or museums, it just couldn't handle the complexity of archives, the

1:28

way things are

1:29

arranged, the hierarchy. It's just different. So, it was a necessity. A total

1:33

necessity.

1:34

The first version, ArchivesSpace 1.0, came out in 2013, and it was this huge

1:39

collaboration with

1:41

places like NYU Libraries, UC San Diego, University of Illinois, all backed by the

1:46

Mellon Foundation,

1:47

and with organizational help from LYRASIS. It was built to be stable from day one.

1:52

That phrase, built by archivists, that really sticks with me. Because you're right,

1:56

a library

1:56

might have thousands of individual books, one barcode each, but an archive, that's

2:01

a whole

2:02

different beast. It's millions of unique, connected things in one collection. So,

2:06

why does that need

2:08

its own software? What can ArchivesSpace do that, say, a really good database couldn't?

2:14

It really boils down to two things. Maintaining what we call intellectual control

2:18

and physical

2:19

control, and doing both at the same time. It's a single system that supports the

2:25

entire life cycle

2:28

of archival work. Everything from the moment an item arrives to the moment a

2:33

researcher finds it.

2:34

So, let's walk through that life cycle. For someone new to this, the terms can be a

2:37

bit

2:38

technical. Sure. So, there are five essential stages.

2:42

What's the first one? The first is accessioning.

2:44

That's just the intake. The moment a collection comes through the door,

2:47

you're recording what you got, who you got it from, the basic legal and procedural

2:51

stuff.

2:52

So, it's the official record of arrival. Got it. What's next?

2:55

Second is arrangement. And this is critical. Archives aren't just random piles of

2:59

stuff.

3:00

They have an original order, the way the creator kept them.

3:02

Right. And that order has meaning. Exactly. So, the system lets you map out

3:06

that hierarchy digitally so the original context isn't lost.

3:09

Okay. That makes perfect sense. What's number three?

3:12

Third is description. This is where you create all that crucial metadata.

3:16

Basically, you're writing the finding aids. The roadmap for researchers.

3:19

Precisely. The guide that tells someone what's in box three, folder six.

3:23

And ArchivesSpace helps generate those in standard formats, like EAD, so they work

3:29

everywhere.

3:29

So, it does the heavy lifting on those really complex documents.

3:32

It does. Then fourth, you have preservation. This part tracks the physical side of

3:38

things.

3:38

Where is it located? What are the environmental conditions? Does it need

3:41

conservation?

3:42

All to ensure its long-term health. And the last one.

3:45

And finally, number five is access. This is the payoff. It's the public interface

3:50

that lets people actually search and discover all this amazing material,

3:55

connecting all that backend work to the researcher.

3:57

Wow. Okay. When you lay it out like that, yeah, a spreadsheet just isn't going to cut

4:00

it.

4:00

Not even close. It really is an essential piece of

4:03

digital infrastructure. And on the technical side, it feels just as solid. You can

4:07

tell it's built

4:08

to last. Yeah. I mean, we don't have to get lost in the code, but the fact that it's

4:11

built on a

4:12

mature language like Ruby says a lot. This isn't some quick web app. It's a

4:16

serious platform meant

4:18

for the long haul. Which is what you need when you're managing history. And you see

4:23

that reflected

4:23

in the development activity. You can look at GitHub and see the numbers. 385 stars,

4:28

238 forks.

4:30

Which shows people are paying attention. Right. But the number that really tells

4:34

the story

4:34

is the 96 contributors. That's not just a couple of developers. That's a dedicated

4:39

community of

4:40

professionals putting in their own time and expertise. 96 people. That's a lot of

4:45

brain

4:45

power. And that kind of active dedication, that really brings us to the most

4:49

fascinating part of

4:50

this whole story. It really is. Because what's so interesting here is that ArchivesSpace

4:53

isn't just

4:55

software you download. It's a community. In what way? I mean, it's an organized

5:00

body of archivists,

5:02

librarians, developers, administrators, all working together. It's community-supported

5:07

software.

5:08

The users aren't just customers. They're the owners. They decide where the software

5:11

goes next.

5:11

They fund it. They manage it. They implement it. That sounds amazing in theory, but

5:15

how does that work in practice? I mean, are archivists expected to learn how to

5:20

code in Ruby?

5:21

Where does that technical skill come from? That's a great question. No, it's not

5:25

all on the

5:26

archivists. The development work tends to come from three main places. You have

5:30

developers at member institutions, developers from vendor partners who host it,

5:34

like Safe Server,

5:35

and then the core program team, which the community's membership fees actually fund.

5:40

So there's a professional core. A professional core guided by the community. And

5:43

you see that

5:44

commitment all the time. Like they just announced Martha Tenney is joining as the

5:48

new standards and

5:49

testing archivist. They're making sure everything stays up to professional

5:52

standards. They're even

5:53

planning their 2026 virtual member forum already. This is a very active, very

5:58

organized ecosystem.

5:59

That active engagement must be what makes it work so well in the real world. And

6:04

here's where it gets

6:05

really interesting when you hear from the people actually using it every day. Like

6:09

the testimonials

6:10

tell the whole story. I was reading what Tessa Wakefield from the University of

6:14

Northern Iowa

6:14

said. She mentioned that it gives her staff more autonomy. They can manage things

6:18

more effectively

6:20

because they aren't waiting on some outside company. They have a say in the tools

6:23

they use.

6:24

Exactly. They can help build the solution they need. And it's not just the archivists.

6:28

What about

6:29

the IT staff? Right. They're often the forgotten piece of the puzzle. Totally. But

6:33

Tom McNeely,

6:34

he's IT at Western Washington University, he said it was pretty easy to install and

6:40

upgrade. And he

6:41

praised the technical documentation. When the IT team is happy, that saves everyone

6:46

time and money.

6:47

That's a huge win. Good documentation is priceless. It really is. But let's bring

6:51

it

6:51

back to the public, to the researchers. The impact on discovery is just massive.

6:56

Heidi Pettit at

6:57

Lawrence College talked about going from over a hundred separate finding aids to a

7:03

single

7:04

searchable collection in ArchivesSpace. Can you imagine trying to do research by

7:09

searching a

7:10

hundred different PDFs one by one? That sounds like a nightmare. A unified system

7:15

is a complete

7:16

game changer for research. It's a revolutionary leap. And that all comes back to

7:20

that community

7:21

governance. I think Bre McLaughlin at Indiana University put it perfectly. She said

7:25

she

7:25

appreciates that feedback and concerns are actually heard. That feeling of having a

7:30

real voice is so

7:31

rare with enterprise software. Usually you just pay your fee and hope for the best.

7:35

That's right.

7:35

You're a customer, not a partner. Which brings us to the question that can be a

7:39

little confusing

7:40

for newcomers. If the software is open source, you know, free to download and use,

7:45

then why is

7:47

there a membership model? It feels like a paradox. It's a great point and it's the

7:51

absolute key to

7:52

its survival. The code is free, yes, but running a program team, offering

7:56

professional support,

7:58

coordinating all that development, that takes real money. So the membership model

8:03

is for

8:03

sustainability. Exactly. It's a collective fund to ensure the tool not only

8:07

survives

8:08

but continues to evolve for the good of the whole field. So it's less like buying a

8:12

product and more

8:12

like, I don't know, supporting public radio. You're funding a shared resource. That's

8:17

a

8:17

perfect analogy. And members get concrete benefits for it. Like what? Well, there

8:21

are tangible

8:22

things like getting technical support directly from the program team and access to

8:27

the user manual.

8:27

But the intangible benefit is the big one. Having a real voice in the future of the

8:34

software.

8:34

And a seat at the table. A seat at the table. Plus, by investing in this shared

8:38

infrastructure,

8:39

institutions protect themselves from being locked in by a single commercial vendor

8:44

who could suddenly change the rules or raise prices. It keeps the power in the

8:48

hands of the

8:49

users. It's like an insurance policy and a way to drive standardization all at once.

8:53

I also noticed

8:54

how they structured the fees. Yeah, the tiered levels are important. Very. They

8:58

have five

8:58

different membership levels, so a huge university and a small local museum can both

9:03

participate

9:04

and have their voices heard. It spreads the cost fairly across everyone who

9:07

benefits. It makes it

9:08

accessible for everyone. It really does. So what we have is this hyper specialized

9:13

tool that thrives

9:15

on shared investment and, well, democratic governance. And it's clearly working. The

9:20

latest release,

9:21

V4.1.1, just came out on July 1st, 2025. It's just a fantastic model for building

9:27

critical

9:28

infrastructure. It really is a key example of how open source, when you back it

9:31

with a smart funding

9:32

and community model, can lead to real innovation and standardization. Without being

9:37

driven by profit.

9:38

Exactly. And think of the efficiency. Tom Adams from Cold Spring Harbor Laboratory

9:43

pointed out

9:43

that because so many institutions use it, they can leverage plugins and tools

9:47

others have built.

9:48

They don't have to spend a ton of money on in-house development.

9:50

That shared effort saves everyone time and money. It lifts the whole sector.

9:55

It really does.

9:56

That is just a tremendously powerful model. So as we wrap up this deep dive, here's

10:00

a final

10:01

thought for you to consider. How does this model, free software sustained by a

10:05

professional

10:05

membership, compare to the other essential digital tools you use every day, the

10:10

ones built by massive

10:11

companies? Does the ArchivesSpace model maybe ensure a better, more focused

10:16

response to the

10:17

actual needs of its users, the archivists? It's something to mull over.

10:22

Definitely something to think about.

10:23

And that wraps up our deep dive. Thank you again to Safe Server for supporting this

10:27

exploration.

10:28

Remember, Safe Server takes care of the hosting of this software and supports you

10:31

in your digital transformation.

10:32

time for the next deep dive.

