Culture Link Rot: How Web Pages Are Lost to Time

By Matt Hirsch

On January 1, 2023, the Internet turned 40. In the forty years since its inception, users have uploaded an unfathomable amount of data to the World Wide Web. In 2022 alone, people uploaded about 97 zettabytes (97 trillion gigabytes) of new data to the web, and that number increases every year.

Most people consider the Internet a sort of modern-day Library of Alexandria, where you can find answers to (almost) any question. However, the links to many older pages no longer work. These dead pages have succumbed to a phenomenon known as “link rot.”


What causes link rot, and why is it an important issue?


According to The Verge, about 72% of links generated in 1998 have succumbed to link rot. A URL (uniform resource locator) can stop working and display the dreaded “error 404” message for several reasons. For example, the web page’s owner could have changed hosts, the domain name could have expired, or the site could have crashed altogether.
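The failure modes listed above look different from the client side, and a rough sketch can tell them apart. The Python below is an illustration only, not something from the article: the function names `classify_failure` and `check_link`, the category labels, and the `link-rot-check/0.1` user agent are all hypothetical, and the mapping from errors to causes is a simplifying assumption (a 404 can also mean the page simply moved, and some servers reject `HEAD` requests).

```python
import socket
import urllib.error
import urllib.request


def classify_failure(exc):
    """Map an exception raised by urlopen to a rough link-rot cause.

    The labels are illustrative guesses, not a definitive diagnosis.
    """
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code in (404, 410):
            return "page removed or moved"
        if exc.code >= 500:
            return "server error"
        return f"http error {exc.code}"
    if isinstance(exc, urllib.error.URLError):
        # A DNS lookup failure often means the domain lapsed or was renamed.
        if isinstance(exc.reason, socket.gaierror):
            return "domain does not resolve"
        if isinstance(exc.reason, (TimeoutError, socket.timeout)):
            return "timed out"
        return "host unreachable"
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "timed out"
    return "unknown"


def check_link(url, timeout=10):
    """Send a HEAD request and return "ok" or a rough failure label."""
    request = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "link-rot-check/0.1"},  # hypothetical UA string
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return "ok" if response.status < 400 else f"http error {response.status}"
    except Exception as exc:  # broad on purpose: every failure gets a label
        return classify_failure(exc)
```

Calling `check_link` on each URL in an old bookmark file or thread would give a crude census of which links have rotted and roughly why.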

So, why is link rot a problem? In 2023, our lives revolve around the Internet. According to Pew Research Center, 85% of Americans say they go online on a daily basis, and nearly a third say they constantly use the web. Especially since the dawn of the social media age, we have used the Internet to connect with friends and family.

In the last decade-plus, we’ve saved many of our fondest memories on Facebook’s (or some other social media site’s) servers. It’ll likely be some time before our old profiles go the way of the dinosaur. However, it will almost inevitably happen (especially if you no longer use the platform).

Link rot also wreaks havoc on journalists, researchers, and academics trying to cite old material. For example, according to a Harvard study, over 70% of the links in the law journals it examined don’t lead to the original sources. About half of the links in the United States Supreme Court opinions studied were rotten. And about three-quarters of the links researchers examined led to content different from what they cited. Additionally, a study by Nanyang Technological University in Singapore showed the issue impacts “.edu” links the most, at 36%.


How can we save our data?


Several organizations and non-profits are working on archiving old data on the web. The Internet Archive is a digital library founded by computer engineer Brewster Kahle in 1996. The public can freely upload and download data to and from its collection. It also saves old, defunct web pages and allows anyone to access them through its web archive, the Wayback Machine. In 2023, there are 811 million old web pages archived on the Wayback Machine.
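For readers who want to check whether a dead page was saved, the Internet Archive offers a public availability lookup for the Wayback Machine. The sketch below is not from the article. It assumes the endpoint at `https://archive.org/wayback/available` and the JSON layout (`archived_snapshots` → `closest` → `url`) as commonly documented, and the helper names `build_query`, `closest_snapshot`, and `find_snapshot` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Wayback Machine availability lookup.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD-style prefix."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(payload):
    """Return the closest archived snapshot URL from a parsed response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(url, timestamp=None, timeout=10):
    """Ask the service for the snapshot nearest to timestamp (needs network)."""
    with urllib.request.urlopen(build_query(url, timestamp), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

A call such as `find_snapshot("example.com", "2006")` would return the address of the capture nearest to 2006 if one exists, and `None` if the page was never archived.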

And in the academic realm, where link rot is a more pressing matter, Perma.cc is the go-to archival service. The Harvard Law School Library Innovation Lab founded the academic archive in 2013 in direct response to the issue. And in 2016, The Institute of Museum and Library Services awarded them a $700,000 grant to expand Perma.cc. It has a crucial difference from the Wayback Machine in that it doesn’t use web crawlers to scour the Internet.

On an individual level, your best bet for saving your digital memories is to store them off the Internet. Social media platforms are increasingly adopting inactive profile deletion policies.
 
This problem has been greatly exacerbated by overzealous moderators deleting whatever they see for no good reason. Pretty much any site that relies heavily on volunteer moderators ends up with most of its content eventually deleted for breaking some obscure rule on a technicality.

If there's one thing KF excels at, it's that if you see a ten year old link, you can be relatively confident it still works. And that's pretty impressive considering all the different TLDs this site has gone through.
 
The number of pictures that link to Imgur but don't exist anymore is annoying.
Or, god forbid, YouTube links

ESPECIALLY for OPs from like 2017 and earlier
The Linkara OP, for example, is just a link to his ED page (which is obviously down because ED cannot exist without changing domain names every couple of years). The Jerry Peet/Lily Orchard OP has a lot of YouTube links that luckily have not been deleted yet, only unlisted, as well as a lot of ED links. Thankfully, there have been a lot of OP rewrites of old cows done by people like Osama, Markass, or Throbnelius; hopefully they will be moved in soon
 
This is why I always save media I personally enjoy. Because when the inevitable happens and a creator or a group breaks apart, most of the time their works are also lost due to neglect.

FB20xl.png

Pic related. An image made by Tyson Hesse on the group of creators (comics, art, games and videos) that ran Fireball20xl. Short story, it imploded because the webmaster was someone who enjoyed pitting women against each other and was a sleazy perv. The farms has a page on the man.

Notable thing about this site is that it survived for years up until its implosion, and many of the talents this group gathered would go on to become notable, such as Christopher Niosi, who would be a VA later.
 
Killing off personal websites and actual interesting semi-pro ones has been one of the great unnoticed negative effects of social media.

Why run a site about "X" when you can just post "X" to your Facebook page?

This article just reminds me of the half-dozen pages I used to religiously follow in the mid-00s that no longer exist, are no longer updated, or have been absorbed by lefty moralists and edited into impotency in the name of feelings.
 
Sure would be a shame if something happened to the archives.

They let someone off the hook: Google was around then, spidered the entire Internet, and had cached copies of almost all of it. You used to be able to pull up those old dead sites the same way you searched for them. But Google stopped offering the deep search that let you find these things.

I doubt a company so dependent on knowing everything on the Internet would have deleted that information; they just don't think it's profitable to maintain it. Getting them to allow access or re-enable cache searching and retrieval would be a huge step in addressing link rot and the lost Internet. I would go so far as supporting lawsuits or legislation forcing them to provide it.
 
Yes, I have put Google cache pages in Archive.today before.

You were still able to use it after they removed the links, but they may have finally disallowed it recently, because doing it manually redirects to this crap:
 
god I remember going to Fireball20xl as a practical fetus and someone in the chat linked a sprite comic of a black Tails recolor fucking Amy
I never went back after that... I think that was a good thing
 