archive.today - Arguably the best archive service

notorietus · Mar 25, 2026

archive.today (aka archive.is) is a web archiving website that saves snapshots on demand. It's one of the few services that can capture a wide range of popular JavaScript-heavy websites (such as Google Maps, X & more). Once a web page is archived, it cannot be deleted directly by any Internet user.

It was created on May 16th, 2012 as a similar service to Megalodon (a Japanese archival service created in 2005). The owner of the service goes into detail on why he created it here (archive), explaining that it was built as a service separate from the Internet Archive (archive.org) and that its guiding principle became "we won't delete what they delete, and vice versa, even when politics isn't involved", with the aim of serving as an apolitical archive service:

We created a service similar to Megalodon, which was already quite popular in Japan. First, we had to choose a domain zone. Not the USA or the EU (the current horrors hadn't happened yet, but SOPA and PIPA was already being planned), and not the Caribbean, where a registrar's server could crash and take months to recover. Libya (.ly) was fashionable at the time, but Gaddafi had just been killed. So Iceland seemed interesting: there were bearded sysadmins in parliament, they created Mailpile. Then we looked at which single-word domains were available.

When we started, archive.org didn't have a "Save Now" function, so our features didn't overlap at all. Even our names are different, just homonyms: archive.org is a noun, while we are a verb: "archive.is/today" was intended as an imperative, like "Save Now!"

Then two things happened. First, archive.org introduced its "Save Now" feature. Second, when we finally started communicating—around 2020—Mark mentioned that they come from a background of left-wing activism (this isn't a secret; their biographies are public; I just hadn't looked into them until it was brought to my attention).

By that time, Gamergate and various other scandals had already occurred. With few small exceptions, the right tended to preserve pages, while the left wanted to delete them. That was my aha moment: no collaborations were possible here. And so we became a kind of dialectical pair: we won't delete what they delete, and vice versa, even when politics isn't involved.

This is what's driving us in this direction, toward the role of a smaller archive.org. Whether that's good or bad, I don't know yet.

archive.today can archive webpages relatively easily because it does not obey robots.txt: it acts "as a direct agent of the human user", as stated in their FAQ page. Archive.today launches real browsers (not even headless) and tries to load lazy images, unroll folded content, login into accounts if prompted with login form, remove “subscribe our maillist” popups, etc (source / archive). After 2019 the site switched to Chromium/80 with a few small patches; before that it used PhantomJS (source / archive). This also allows archive.today to archive Tor webpages as well as specific IP addresses.
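
For anyone wondering what "obeying robots.txt" actually involves, here is a minimal sketch of the check a conventional crawler performs before fetching a page, and which archive.today says it skips. It uses Python's standard `urllib.robotparser`; the robots.txt rules, the `ExampleCrawler` user agent and the example.com URLs are made up for illustration and have nothing to do with archive.today's own code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download this
# from https://<site>/robots.txt before crawling.
ROBOTS_TXT = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT)  # parse in-memory rules, no network access needed

def polite_crawler_may_fetch(url: str, user_agent: str = "ExampleCrawler") -> bool:
    """Return True if a robots.txt-respecting crawler would fetch this URL."""
    return parser.can_fetch(user_agent, url)

blocked = polite_crawler_may_fetch("https://example.com/private/page")
allowed = polite_crawler_may_fetch("https://example.com/public/page")
print(blocked, allowed)
```

A crawler that honors robots.txt would skip the first URL. A service that treats each request as a human-initiated page load, as archive.today describes itself, would not run this check at all.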

The website is closed source because it has too many hardcoded details specific to the owner's installation, so it is unlikely to ever be open-sourced. As stated by the owner:

Unlikely. It has too many hardcoded things specific to my installation. From the type of hardware (like ”that server is too old that it requires kernel-4.4 with a specific patch”) to using a quite exotic operating system.

There is plenty of open-source software in this area: https://github.com/iipc/awesome-web-archiving

(link / archive)

To avoid detection/blocks, archive.today runs via a botnet that cycles through countless IP addresses (using VPS/shithosts), making it quite difficult for webmasters to stop their sites getting archived. For regular websites, they sometimes ask for various ways to get around it. They don't use any residential proxies as "that pays off for ad-traders, but not for a screenshot service".

(link / archive)

archive.today currently has over a petabyte (1000TB) of data with over 500 million webpages archived. This thread serves as a general discussion thread for anything related to the site.

In 2012, the site already had 10 TB of archives and cost ~300 euros/mo to run, escalating to 2000 euros by 2014 and $4000 by 2016. As of 2021, they have archived on the order of 500 million pages, and with the average size of a webpage clocking in at well over 2 MB these days, that’s a cool 1,000 TB to deal with. (For comparison, the Internet Archive is around 40,000 TB.)
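
The storage figure in the quote above is simple multiplication, and a quick sketch makes the assumptions visible. The 2 MB average is the quote's own floor ("well over 2 MB"), and decimal units (1 TB = 1,000,000 MB) are an assumption on my part, so treat the result as a rough lower bound and not a measured number.

```python
# Back-of-the-envelope check of the quoted storage estimate.
PAGES = 500_000_000           # "on the order of 500 million pages"
AVG_PAGE_MB = 2               # "well over 2 MB", taken here as a floor
MB_PER_TB = 1_000_000         # assumption: decimal units, 1 TB = 1,000,000 MB
INTERNET_ARCHIVE_TB = 40_000  # comparison figure given in the quote

def estimated_terabytes(pages: int, avg_page_mb: float) -> float:
    """Total storage in TB for a given page count and average page size."""
    return pages * avg_page_mb / MB_PER_TB

archive_today_tb = estimated_terabytes(PAGES, AVG_PAGE_MB)
ratio = INTERNET_ARCHIVE_TB / archive_today_tb

print(f"archive.today: ~{archive_today_tb:,.0f} TB")
print(f"Internet Archive is ~{ratio:.0f}x larger")
```

Under these assumptions the estimate comes to 1,000 TB, i.e. about one petabyte, which puts the Internet Archive at roughly 40 times the size.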

archive.today domains:

archive.today
archive.is
archive.fo
archive.li
archive.vn
archive.ph
archive.md
archive.ec: Lost due to service interruption for 9+ days, domain was resold, not in control of archive.today anymore.

It is intentional.

No single domain is reliable and I have no means to enforce control on each domain.

* archive.today - threatened with confiscation http://blog.archive.today/post/116913927371/the-domain-registrar-gransy-s-r-o-aka, also a troll attack caused service interruption https://blog.archive.today/post/138982909006/domain-problems-again
* archive.is - threatened with confiscation https://twitter.com/archiveis/status/1081276424781287427, asked not to use “archive.IS” for branding (that’s why you see “archive.TODAY” in the top-left corner; although many people remembered it as “archive.IS” and refer it so)
* archive.fo - threatened with confiscation https://twitter.com/archiveis/status/1188222460598116353
* archive.li - attacked by trolls impersonating police, caused few days service interruption https://twitter.com/archiveis/status/956025540028268547
* archive.ec - attacked by trolls causing service interruption and finally lost https://twitter.com/archiveis/status/1093608363647291393
* archive.vn - ok so far
* archive.ph - ok so far
* archive.md - ok so far
* a nice domain unrelated to archive - one day whois started showing someone’s else information and the registrar did not response, the domain was lost
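
Since no single domain is reliable, it can help to rewrite a snapshot link onto the other domains when one stops resolving. The sketch below is only an illustration: it assumes a snapshot path is served identically on every domain still under the site's control, and the snapshot ID `abcde` is a made-up placeholder. archive.ec is left out because it was lost.

```python
from urllib.parse import urlsplit, urlunsplit

# Domains listed above as still controlled by archive.today.
MIRRORS = [
    "archive.today",
    "archive.is",
    "archive.fo",
    "archive.li",
    "archive.vn",
    "archive.ph",
    "archive.md",
]

def mirror_urls(url: str) -> list[str]:
    """Return the same snapshot URL rewritten onto every other mirror domain."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host not in MIRRORS:
        raise ValueError(f"not a known archive.today domain: {host!r}")
    return [
        urlunsplit((parts.scheme, mirror, parts.path, parts.query, parts.fragment))
        for mirror in MIRRORS
        if mirror != host
    ]

# Hypothetical snapshot link used purely as an example.
alternatives = mirror_urls("https://archive.is/abcde")
for alt in alternatives:
    print(alt)
```

If a link on one domain fails, trying the alternatives in order is a cheap first step before concluding the snapshot is gone.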

archive.today socials:

Twitter
Archive.today Tumblr (includes a major list of technical questions, downtime reports, general help, etc.)
Blog
Liberapay
Email (webmaster@archive.today)

thirstytux · Mar 31, 2026

A grand post, notorietus.

My only gripe is I wish it worked 100% of the time without JavaScript enabled, no big deal though.

Oh, I noticed this in the news awhile back:

"FBI orders domain registrar to reveal who runs mysterious Archive.is site. Tucows subpoenaed in criminal probe for info on 'customer behind archive.today.'"

- https://arstechnica.com/tech-policy...o-unmask-mysterious-founder-of-archive-today/

johnny johnny yes papa · May 5, 2026

My only complaint is the long queue times.

Hellwalker · May 5, 2026

Is anyone else having severe problems with archiving with archive.today? By severe, I mean can't even fucking archive anything anymore. I don't know if this is because the owner is still having that war with some Finnish retard or whatever drama that I couldn't give less of a shit about but my current problems with it are:
- I constantly get a reCAPTCHA. When opening the site, when clicking any on-site link, when trying to archive a link, anything. It's fucking insufferable.
- I can't even archive shit anymore from my experience. The loading page you get redirected to is just an infinite loop now. I've waited quite a bit and I get no change.

Just these 2 problems make the service unusable to me. I'm forced to use only Ghostarchive now.

For the record, I am using a VPN and have a number of privacy extensions on, but turning these off seems to make no difference, and neither do the other usual methods of trying to solve an issue with a site, like clearing cookies and whatever.

I hope someone else here is having the same problems as me. I hate this bullshit.

Lasagna4Dead · May 5, 2026

Hellwalker said:
Is anyone else having severe problems with archiving with archive.today? By severe, I mean can't even fucking archive anything anymore. I don't know if this is because the owner is still having that war with some Finnish retard or whatever drama that I couldn't give less of a shit about but my current problems with it are:
- I constantly get a reCAPTCHA. When opening the site, when clicking any on-site link, when trying to archive a link, anything. It's fucking insufferable.
- I can't even archive shit anymore from my experience. The loading page you get redirected to is just an infinite loop now. I've waited quite a bit and I get no change.

Just these 2 problems make the service unusable to me. I'm forced to use only Ghostarchive now.

For the record, I am using a VPN and have a number of privacy extensions on, but turning these off seems to make no difference, and neither do the other usual methods of trying to solve an issue with a site, like clearing cookies and whatever.

I hope someone else here is having the same problems as me. I hate this bullshit.

domains have been seized and shut down for me

Fresh Meat · May 5, 2026

Lasagna4Dead said:
domains have been seized and shut down for me

i trans heart spreading misinformation

maybe they're blocked by your ISP / DNS provider, but they are definitely not "seized"

Jemn Oopi · May 5, 2026

notorietus said:
archive.today

Other threads:

Jemn Oopi said:
Archive.today: Operator uses users for DDoS attack

Margo Martindale said:
Cloudflare flags archive.today as "C&C/Botnet"; no longer resolves via 1.1.1.2
