I’ve been thinking about mortality and longevity recently. Today, I’d like to think about the durability of our data, the lifespan of the websites we rely on, and the process of software reaching End-Of-Life.
A natural final destination for software is deprecation and deletion. Sure, some mission-critical software will stay in use for decades to come, but even the essential tools of the 70s have been replaced just fifty years later. This is understandable, since computer science is such a young field, but during this evolution, I’m sure many lines of BASIC, Pascal, and all those languages now regarded as ancient have been wiped and lost to time.
There are initiatives that aim to prevent or delay that from happening. The Internet Archive is perhaps the most well-known example of this I can think of. The GitHub Arctic Vault program similarly strives to preserve current-day code as far into the future as possible. In 2020, all active public repos were archived into an Arctic vault, deep in the permafrost. I’ll note, though, that GitHub’s own site points to the possibility of climate change affecting the storage site (albeit only a thin layer of the permafrost under which the vault lies) among other factors.
Even if our data far outlasts us, is there value in holding onto it beyond historical curiosity? I think sentimentality can apply to software too. We all have childhood memories that have deeply affected us, so it makes sense that our data and the software we hold dear would as well. In this post, I’ll primarily discuss personal data and websites, since I’ve seen more examples of them experiencing this slow EOL, but there are parallels to be drawn with binary software and hardware as well.
It’s painful to watch things we care about fade into obscurity, then stop working or lack support, then enter the digital ether. Same goes for people. Just as we reflect on the impact of those who have passed away on our own lives, I think it’s interesting to think about how the makeup of our toolkits will shift as software hits its EOL.
Elementary, my dear Watson
I named my first laptop Watson. I got it in the summer before sixth grade, and used it almost until my senior year of high school. The summer before I started high school, I dual-booted Watson (originally running Windows) with Linux, and I quickly turned to using Linux daily over Windows.
However, at some point after I dual-booted with Linux, my Windows installation became extremely sluggish. Logging in would take about three minutes. When it’d finally load, I’d get a black screen with just a cursor, and even the shortcuts to open Task Manager weren’t working.
About a year ago, my dad’s laptop, a very old Toshiba bought not long after I was born, also started slowing down. I volunteered my old laptop, Watson, for him to use, and set to getting it ready for him.
I was trying to fix the Windows partition of my system while only reformatting what was necessary. I’d transferred over the contents of my Linux partition to my new device, so there was no sentimentality there, but there were still some old screenshots and game files on the Windows side that I’d have preferred to keep. Over the course of a month, I tried to fix Windows without wiping everything. I’d never looked at those files after switching to Linux, and I don’t know why I was so attached to them. In the end, in a Herculean display of rationality, I wiped it all and factory reset my laptop. It works now, and my dad’s happy with his ’new’ machine. I still think about what I’ve lost in that process though — the files that were on that drive that I’ll never get back.
Computers have afforded both digital abundance and digital hoarding. There are few reasons not to just hang onto everything, and it feels wrong to let files go. Just buy a hard disk and copy files over, regardless of whether you’ll look at them again. There’s a niggling voice in my head wanting to keep things around for the sake of some nebulous ‘future me’. There are folks that seem to have similar hangups: I have a systems professor who said that every time his disk gets close to filling up, he buys a new laptop with twice as much storage and copies everything over. Per Moore’s law, it’s worked out okay so far.
The files I deleted on my Windows drive were from my formative years of learning to use a computer. I weirdly sometimes want them back as keepsakes of how much I’ve grown, but I think I’m coming to terms with the fact that knowing I’ve had those experiences is enough. Yet it still feels wrong to hard-delete files, like I’m actively cutting the lifespan of my data short.
Even if you’re not the one pulling the trigger and hitting Del, I wonder how much data online is continuously being lost passively. It doesn’t make financial sense to retain records forever: storing data is cheap, but at scale, I realize it can add up. Google is deleting inactive Gmail accounts, and reportedly, so is Instagram. This more passive type of deletion is also interesting to consider; it’s like the data is slowly marching towards its deletion.
Today, there are on the order of 120 zettabytes of data in the world (a zettabyte is 10^21 bytes, or roughly 2^70 bytes), but I don’t know if this statistic accounts for deleted or inactive data. That’s more difficult to estimate, but if there’s 330 million terabytes of data created per day, there’s got to be at least a few million deleted. I wonder what we lose each day. The 330 million terabytes is primarily video data, so there’s probably plenty of old video content in what’s deleted, but perhaps some meaningful software too.
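As a rough sanity check on those two figures, here is a small back-of-the-envelope sketch that converts 330 million terabytes per day into zettabytes. It assumes decimal (SI) units, which is my assumption rather than something the statistics state, and the constant names are just illustrative.

```python
# Back-of-the-envelope unit check for the figures quoted above.
# Assumption: the statistics use decimal (SI) units.
TERABYTE = 10**12    # bytes
ZETTABYTE = 10**21   # bytes
ZEBIBYTE = 2**70     # bytes, the binary cousin of the zettabyte

TB_CREATED_PER_DAY = 330_000_000  # the per-day figure quoted in this post

bytes_per_day = TB_CREATED_PER_DAY * TERABYTE
zb_per_day = bytes_per_day / ZETTABYTE
zb_per_year = bytes_per_day * 365 / ZETTABYTE

# How much bigger a zebibyte (2^70 bytes) is than a zettabyte (10^21 bytes)
zebibyte_ratio = ZEBIBYTE / ZETTABYTE

print(f"{zb_per_day:.2f} ZB created per day")
print(f"{zb_per_year:.2f} ZB created per year")
print(f"1 ZiB is about {zebibyte_ratio:.2f} ZB")
```

Under those assumptions, the daily figure works out to about a third of a zettabyte per day, or roughly 120 zettabytes over a year, and 2^70 bytes is about 18% more than a single zettabyte.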
I’m for change and innovation in the web space. As someone who’s blindly thrown ES2023 features into code without worrying too much about browser support or interoperability, I get it. It’s not the fault of web maintainers in any way. Yet it feels odd to have that reminder that things on the Internet aren’t here, in the same state that they are now, accessible in the same ways they are now, forever.
The rest of my devices are fairly up-to-date, so I haven’t suffered too much from device support issues beyond using this old iPad. It’s a good reminder to me, though, that even if software sticks around for a long time, the hardware we have may not always fully support it. GitHub has archived so many repositories in its Arctic Code Vault, but in the future, will we even have the hardware to run them? We’ll have to rely on emulators and apocryphal knowledge of processor, browser, and operating system quirks.
This domain is for sale
To keep software innovating, sometimes we can’t support all devices. To some extent, that’s acceptable. The core functionality still works, we get new devices, the world goes on. But what happens when the app itself goes down?
There are websites out there that were left in an ‘it works now but don’t touch it’ state, abandoned or forgotten. I came across this equine center site that appears to have once had live horse cams, which are understandably no longer active, as the center shut down in 2018. I wonder how the site’s still running, five years later. Granted, it’s very Web 1.0, so I doubt there were many critical dependencies, but the domain and hosting are still up.
Sites like these suffer from something similar to, but not entirely like, bit rot. With bit rot, functionality itself glitches or corrupts, but here the site fades away piece by piece. First the outlinks die, then some iframe source no longer connects, and so on. Because of backwards compatibility, most things keep working, so it’s like a slowly decaying time capsule with a ticking timer for when it’ll disappear.
Archive.org does a good job of capturing most sites on the internet to preserve snapshots before they go down, but it doesn’t capture the full interactivity of pages. Outlinks aren’t captured by default, so even if a page loads, we might not be able to access the websites in its latent space. Apps that require authentication won’t get captured either. Fifty years down the line, we’ll only have screenshots and video recordings left.
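If you’re curious whether a page you care about has a snapshot at all, here is a minimal sketch using the Internet Archive’s public Wayback availability endpoint. The response shape it parses (an `archived_snapshots` object with a `closest` entry) is my understanding of that API rather than something I’ve verified exhaustively, and the function names are hypothetical helpers of my own.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Wayback Machine availability endpoint (response shape assumed, see above)
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url: str) -> str:
    """Build the query URL asking whether page_url has any snapshot."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the closest snapshot URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def fetch_closest_snapshot(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the Wayback Machine for the closest snapshot (needs network access)."""
    with urlopen(build_availability_url(page_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

Calling `fetch_closest_snapshot("example.com")` should return a snapshot URL if one exists, or `None` if the page was never captured. It only tells you a snapshot exists, not whether the outlinks or interactive parts survived.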
Hosting also costs someone somewhere money, even if it’s ostensibly offered for free, so it makes sense that apps aren’t hosted forever either. However, it’s a lot easier to justify keeping an app up if it’s running for free, as was possible with Heroku and its competitors. I’d be interested in an estimate of what small percentage of the Internet Heroku took down just by removing its free tier. It’s likely small, but biased towards personal passion projects. I’d think work like this is highly sentimental to its creators, but perhaps not so much so that it’s worth spending extra on hosting [2].
Finally, I’d like to consider domain names. It’s impossible to buy a domain name forever; we’re just renting them by the year. Gilles Castel, notable for their Vim + LaTeX notetaking workflow, unfortunately passed away in 2022, but their site remains up. Perhaps they prepaid for a domain for many years, or perhaps someone close is taking care of the registration for them. I can’t help but wonder what happens as long-held domains and hosting plans expire, particularly for proprietary websites where the original source may be lost entirely.
One open-source project that inspired many of the deprecation-related thoughts in this post is twemoji, Twitter’s signature emoji library, now forked from the prior Twitter open-source project. A few months ago, I saw this issue, noting that there were problems with licensing prior emoji assets for reuse, so the project would stagnate. I use Twemoji as my default emoji font, and I realized that while previously released emojis would still work, more and more emojis would appear as tofus as time went on. I would literally be able to see the font decay.
Happily though, this PR recently got merged with the Unicode 15.0 updates and what I assume was approval from Twitter to continue the project. It’s still a volunteer-run effort, but Discord as a company seems invested in contributing, so I’m optimistic that Twemoji will continue to be updated until the inevitable process of software EOL kicks in yet again.
My dad once very thoughtfully reflected that everything needs maintenance: health, physical objects, relationships. So does software. When there’s no one there, or no more resources available, to maintain it, the metaphorical health of the sites we frequent today will begin to decline. Eventually, we won’t be able to locate them once their domains expire, or run them once their backends stop being hosted or our new processors ditch support. I’d reckon they’ll eventually all be deleted as well, despite our best efforts to memorialize and archive them.
As they say, though, such is life. While our software still runs (or hobbles along) for us today, we can learn to appreciate its quirks and the marks it’s left on us. They say life is fleeting, and maybe it’s okay that software is too.
[2] After Heroku announced the end of its free tier, many switched to Fly.io, which also offers a reasonable free tier. I made the switch for my CORS proxy for Matter, but that was because I was actively using Matter. I suspect there were many little tools like Matter that went down just because someone didn’t bother with manually porting things over.