Have you ever, like, found something really cool online, you know,
like a super insightful article or something. And then you're like, Oh man,
I wish I could save this, you know,
cause it feels like it's going to disappear tomorrow. Yeah. Yeah.
It's tough, right? I mean, so much is out there.
It is. And it's like, you can't just grab it. You know what I mean?
It's like trying to hold on to smoke or something. But today,
we're going to look at ways that we can actually kind of save these bits of the
internet that are important to us. Oh, and by the way,
big thanks to Safe Server for making this whole deep dive possible.
They're great. They are.
Safe Server takes care of hosting for tools like this and also offers some awesome
advice on digital transformation.
You can find out more at www.safeserver.dg.
Yeah. And you know that feeling you mentioned about content just vanishing.
That's something a lot of people struggle with.
What we're talking about today is, well, think of it as building your own digital
ark, right, for all that information. So it doesn't just disappear.
It's the perfect metaphor. And the tool we're focusing on today is called
ArchiveBox. Catchy name. It is, right? Now,
it might sound kind of like, I don't know,
something someone would use in their basement, super techie.
And this whole idea of self-hosting might sound a bit intimidating, but trust me,
the basic idea is really simple and super helpful, especially if you are,
you know, someone who wants to learn, stay informed,
all that without getting lost in like the endless sea of the internet.
It can be overwhelming. Oh, totally. I mean,
think of ArchiveBox like your own personal library for web
pages. That's a good way to put it. It's like that save as button, you know,
but for the whole internet. And the best part is you control it.
You're in charge. Exactly. And it's what's known as open source.
That means it's free and people are constantly improving it.
Like a community effort. Cool. It is. And the self-hosted part,
that just means you decide where it all lives, your computer, a home server,
wherever you own your stuff.
So instead of just bookmarking something and hoping for the best, ArchiveBox
actually makes a copy. Copies, actually, sometimes a bunch of different kinds.
Oh wow. Yeah. It's like super safe keeping for the stuff you care about.
So how is this different from say archive.org?
They're saving tons of web pages already. Sure.
Think of archive.org like a huge public library open to everyone.
ArchiveBox is more like your own personal collection.
You can save public stuff. Sure. But also private stuff,
things behind a login or just things you want to keep to yourself.
Makes sense. So it's not just the big public stuff.
It's the things I find valuable, even if they're niche or personal.
Now you keep mentioning open source. What's the big deal with that for,
you know, regular people? Okay.
So imagine you have a recipe and it's open source.
Anyone can look at it, use it, even change it. That's ArchiveBox.
It's constantly evolving because a community of developers is working on it.
You get to use all that for free. So it's like always getting better. Yeah. Okay.
So we've got a good idea of what ArchiveBox is, but why is this good for you?
You know, our listeners, you guys are all about learning,
getting the info quickly, but thoroughly seeing different sides of things and
having those aha moments. How does ArchiveBox help with all that?
It's all about efficiency. Totally. It's like,
instead of relying on like scattered bookmarks,
you're actually saving the source. You're curating your own library of knowledge.
Okay. Give me some examples.
What could someone actually save using ArchiveBox?
So many things: research papers, code examples.
Say you find a really good thread on social media packed with insights.
ArchiveBox can grab that. Or you're researching something and you find a bunch
of articles you know you'll want to reference later. ArchiveBox.
So it's not just, like, blog posts. Nope. It's super versatile.
You can give it specific web addresses. You can import your browser bookmarks,
even stuff you've saved in apps like Pocket.
If you follow people on social media or use RSS feeds,
it can even automatically save new content from those. Pretty neat, right?
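To make the RSS idea a bit more concrete, here is a minimal Python sketch of how a list of page addresses could be pulled out of a feed before being handed to an archiver. It is only an illustration, not how ArchiveBox itself parses feeds; the function name `links_from_rss` and the sample feed are made up for this example.

```python
import xml.etree.ElementTree as ET


def links_from_rss(xml_text: str) -> list[str]:
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link:
            links.append(link.strip())
    return links


# Hypothetical feed content, standing in for a real download.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>One</title><link>https://example.com/one</link></item>
<item><title>Two</title><link>https://example.com/two</link></item>
</channel></rss>"""

if __name__ == "__main__":
    for url in links_from_rss(SAMPLE_FEED):
        print(url)
```

Each address that comes out of a step like this is one more page to save.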
Very cool. So that's the quickly gained knowledge part, right?
You've got it all in one place,
but how do you avoid being overwhelmed by like a mountain of digital stuff?
Here's where it gets really clever. ArchiveBox doesn't just save a link.
It goes into the webpage and it pulls out the content.
So it saves the text, downloads videos and images, even grabs code.
So when you revisit something, the key stuff is right there.
You can even search through your archive by keyword. It's powerful. Wow.
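As a rough picture of what keyword search over saved text means, here is a small Python sketch. It assumes the extracted text of each page is already in memory, keyed by address; `search_archive` and the sample data are hypothetical, and ArchiveBox's own search works differently under the hood.

```python
def search_archive(texts: dict[str, str], keyword: str) -> list[str]:
    """Return the addresses whose saved text contains the keyword (case-insensitive)."""
    needle = keyword.casefold()
    return sorted(url for url, text in texts.items() if needle in text.casefold())


if __name__ == "__main__":
    # Hypothetical saved pages: address -> extracted text.
    saved = {
        "https://example.com/sourdough": "A long guide to Sourdough starters.",
        "https://example.com/docker": "Notes on running containers at home.",
    }
    print(search_archive(saved, "sourdough"))
```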
So no more endless scrolling,
trying to remember where you saw that one thing. Exactly. And it gets better.
To make sure your stuff doesn't become unusable later on,
it saves everything in standard formats. So you've got HTML pages, PDFs, images,
video files, plain text files, even special archive files called WARC.
Those are like time capsules for a webpage.
You can open all this stuff with tons of different programs,
even if you're not using ArchiveBox anymore.
So this is all about building like a personal, well organized,
future proof library of the internet. You got it. Okay. Let's get a little
technical.
How does ArchiveBox actually do all this, like, step by step? Okay.
So you give ArchiveBox a URL. It takes a snapshot of that page,
but not just a photo.
It saves it in all those different formats we talked about, just in case one
format becomes obsolete down the road. Like multiple backups.
So, like, extra careful. Totally. So it grabs the website code, the HTML,
the CSS, sometimes even the JavaScript, and creates a single HTML file for offline
viewing. Super handy. Then a screenshot for visuals, a PDF,
good for reading. And those WARC files
bundle everything together. And for articles,
it can strip away everything except the text. So it's really clean.
If there's video or audio, it downloads those files.
If it's a software project, like on GitHub, it can save the code itself.
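The "one page, several formats" idea can be sketched in a few lines of Python. This is a toy version using only the standard library: it keeps the raw HTML and also produces a text-only copy. The file names `page.html` and `article.txt` are invented for the example and are not ArchiveBox's actual output layout.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Strip markup and return the page text, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def snapshot_outputs(html: str) -> dict[str, str]:
    """Map hypothetical output file names to their contents for one page."""
    return {"page.html": html, "article.txt": html_to_text(html)}


if __name__ == "__main__":
    page = "<body><h1>Hello</h1><script>var x = 1;</script><p>World</p></body>"
    for name, content in snapshot_outputs(page).items():
        print(name, "->", repr(content))
```

Keeping both copies is the point: if one format stops being readable someday, the other is still there.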
Wow. That's a lot. So in everyday use, do I need to be like a coding expert?
Not at all. One of the coolest things is there are options for everyone.
If you like the command line, you can use it that way.
But there's also a web interface, like a website you access in your browser,
click buttons, manage your stuff, easy peasy. And if you're a developer,
you can use it within other software. Cool. So there's a user friendly way,
even if you're not into coding. Absolutely.
And to make things even easier, there's a scheduling feature.
You can set it to check your favorite sites, social media feeds, RSS feeds,
whatever, automatically at set times. It grabs new content
so you don't have to. That's a game changer.
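The scheduling idea boils down to asking "which sources haven't been checked recently enough?" Here is a minimal Python sketch of that check. The names `due_sources`, `blog-feed`, and `news-feed` are hypothetical, and this is not ArchiveBox's scheduler, just the logic in miniature.

```python
from datetime import datetime, timedelta


def due_sources(last_checked: dict[str, datetime], now: datetime, every: timedelta) -> list[str]:
    """Return the names of sources whose last check is at least `every` ago."""
    return sorted(name for name, ts in last_checked.items() if now - ts >= every)


if __name__ == "__main__":
    # Hypothetical sources and the time each was last checked.
    last = {
        "blog-feed": datetime(2024, 1, 1, 8, 0),
        "news-feed": datetime(2024, 1, 1, 20, 0),
    }
    print(due_sources(last, now=datetime(2024, 1, 2, 9, 0), every=timedelta(days=1)))
```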
And you mentioned that it uses familiar tools. Yeah. Under the hood,
it's using things like Chrome, wget for downloading,
yt-dlp for videos.
And remember your data is stored in regular files and folders, nothing fancy,
nothing locked away. That makes me feel better. Okay.
So how do we actually get started with ArchiveBox? What's the easiest way?
So installing software can be scary, but they've made it really easy.
One of the simplest ways is using something called Docker.
Think of it as a package. It has ArchiveBox plus everything else
it needs to run, all in one. You don't have to install a bunch of separate things.
Docker takes care of it. Like a pre-assembled kit. Precisely.
The ArchiveBox docs have a quick start guide. They recommend Docker Compose.
That's how you manage these Docker packages.
Easiest way to get going on pretty much any computer. Mac, Windows, Linux,
you name it. You download a file, run a couple of commands. That's it.
The docs tell you exactly what to do.
That sounds doable. And once it's set up, how do I save a webpage?
Easy. Once ArchiveBox is running, you just type a single command,
include the website address and that's it. Again,
the docs show you exactly how to do it.
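For a sense of what that single command looks like, here is a small Python sketch that builds it. It assumes the `archivebox add` subcommand and, for the Docker route, a Compose service named `archivebox`, as in the project's quick start; treat both as assumptions and check the docs for your version. The helper name `build_add_command` is made up for this example.

```python
import shlex


def build_add_command(url: str, use_docker: bool = False) -> list[str]:
    """Build the argument list for saving one URL (assumed CLI shape, see the docs)."""
    # Assumption: the Docker Compose service is called "archivebox".
    base = ["docker", "compose", "run", "archivebox"] if use_docker else ["archivebox"]
    return base + ["add", url]


if __name__ == "__main__":
    # Print the commands rather than running them, so this is safe to try anywhere.
    print(shlex.join(build_add_command("https://example.com")))
    print(shlex.join(build_add_command("https://example.com", use_docker=True)))
```

To actually run it, the same list could be passed to `subprocess.run` from a folder where ArchiveBox has been set up.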
What about that web interface? Once ArchiveBox is going,
you can usually get to it by typing a specific address into your web browser,
something like http://127.0.0.1:8000,
usually. And that's it. You'll see the ArchiveBox interface,
add new URLs, browse your archive, change settings, all with your mouse. Nice.
So it really does offer like power and simplicity at the same time.
And then we can use it. So to wrap things up for our listeners,
it sounds like ArchiveBox is a really amazing way to, like, create your own
personally controlled archive of internet stuff that's important to you.
And it's built to last. Yeah. It's about giving you the power: you curate,
you preserve, and you do it on your terms.
And even though it might seem a little techie at first,
the main goal is simple,
making sure you can access the stuff you need for years to come.
If you're curious or you want to try it out, check out the ArchiveBox website.
It's archivebox.io.
They're also on GitHub where all the code is available.
Makes you wonder, doesn't it?
What if you had a reliable archive of like all the important info you found
online? What could you do with that?
That's an interesting thought, something to think about.
And a huge thanks to Safe Server once again for supporting this deep dive.
They provide hosting for software like this,
and they also offer some really cool consulting on all kinds of digital
They're awesome.