Archival Tools - How to archive anything.

Gog & Magog · Mar 26, 2025

CEO of Gay said:
I haven't seen any signs of malware yet, but will be doing a scan.

On clearnet, I get straight to the main archive.ph page with no CAPTCHA, and can search and archive with no issues.

I wonder if something is wrong with the CAPTCHA redirect, then. In chat, @Gog & Magog said he could not get past the CAPTCHA while on Tor, if I understood him right.

EDIT: No scan results yet, but I get the same behavior on my phone. Clearnet: no CAPTCHA, no redirect to rurtnews.com. TOR: CAPTCHA, which redirects to rurtnews.com

I doubt this is the doing of malware. Now why would a virus redirect affect only one website on one browser?

Markass the Worst · Mar 26, 2025

CEO of Gay said:
EDIT: No scan results yet, but I get the same behavior on my phone. Clearnet: no CAPTCHA, no redirect to rurtnews.com. TOR: CAPTCHA, which redirects to rurtnews.com

~~If it also happens on your phone then it's possibly an issue with your network or router.~~

I HAVE A RADIO · Mar 26, 2025

CEO of Gay said:
I have not tested this from the clear net nor another VPN.

Same issue on clearnet + VPN. I try to solve captcha and it redirects me to RT after a few seconds.

clipartfan92 · Mar 26, 2025

CEO of Gay said:
I have not tested this from the clear net nor another VPN.

I've got the same problem. I've tested with two different devices, using two different connections, through two different VPN companies. When you get to the CAPTCHA screen, after it sits five seconds it redirects to the https://rurtnews.com/ site.

I'm a Silly · Mar 26, 2025

The RT re-direct happened for the 1st time tonight. I closed the tab and retried, then everything was normal. It happened 1x per ~5 archives I made in that one session.

Markass the Worst · Mar 27, 2025

Holy fuck, you're right. It's happening to me too. What the fuck is going on?

It's inconsistent though. Sometimes I can solve the CAPTCHA and get through so it's not like this makes archiving impossible.

The Mass Shooter Ron Soye · Mar 27, 2025

Is it an official RT domain? That would normally be rt.com.

Lol:

https://archive.ph/https://rurtnews.com/

HahaYes · Mar 27, 2025

Hooked a browser into zaproxy and did some looking into it, kinda does look like they've been hijacked going by what their reverse proxy is returning in the response requests across multiple TLDs:

The captcha doesn't even get a chance to be loaded or completed before it redirects over to rurtnews. Additionally, my captchas are showing up in fucking squiggle language of all things, despite me using a VPN subnet nowhere near any middle eastern geolocation:

CEO of Gay · Mar 27, 2025

Markass the Worst said:
~~If it also happens on your phone then it's possibly an issue with your network or router.~~

Normally I would agree with you. I should have mentioned that I deliberately left my house and made sure my phone was on cellular data with wifi disabled for that test, specifically to determine whether it was my network. I see this is affecting you as well now.

The Mass Shooter Ron Soye said:

LOL at archiving degeneracy being interrupted by CAPTCHA redirects. Who knew Russians liked feet that much?

I don't know if that's an official domain for Russia Today or not. While the IP address is the same, the name servers are different. rt.com's DNS records are hosted by rttv.ru, whereas rurtnews.com's DNS records are hosted by megafon.ru. Of note is that runewsrt.com also points to the same IP and is also hosted by megafon.ru, and was registered 7 days ago. Perhaps that means nothing and is only a coincidence, but I found it interesting.

HahaYes said:
Hooked a browser into zaproxy and did some looking into it, kinda does look like they've been hijacked going by what their reverse proxy is returning in the response requests across multiple TLDs:

I should have thought to check for that. I see that now too, looking at the response headers in Tor Browser's console.

Although this has been going on for over 6 hours now, I sent an email to the Archive Today webmaster in case they weren't aware of what's going on.

@Null, is this worth a feature or some kind of notice/warning header, or should I dig into this more and start another thread so I can stop shitting up this one?

Null · Mar 27, 2025

CEO of Gay said:
@Null, is this worth a feature or some kind of notice/warning header, or should I dig into this more and start another thread so I can stop shitting up this one?

If you want to make an I&T thread go for it

AnOminous · Mar 27, 2025

CEO of Gay said:
On clearnet, I get straight to the main archive.ph page with no CAPTCHA, and can search and archive with no issues.

I got exactly this redirect a couple hours ago, which was a little disconcerting considering it's recaptcha. It's not doing that now, though, although I'm now getting the "bad goy" kind of captcha where it gives you eight shitty blurry challenges in a row before letting you in.

Reddit also mentions this: https://www.reddit.com/r/DataHoarde...ivetoday_redirecting_to_a_weird_russian_news/

And of course the top rated comment is absolute retardation thinking the OP was talking about the Wayback Machine instead.

CEO of Gay · Mar 27, 2025

AnOminous said:
I got exactly this redirect a couple hours ago, which was a little disconcerting considering it's recaptcha. It's not doing that now, though, although I'm now getting the "bad goy" kind of captcha where it gives you eight shitty blurry challenges in a row before letting you in.

Reddit also mentions this: https://www.reddit.com/r/DataHoarde...ivetoday_redirecting_to_a_weird_russian_news/

And of course the top rated comment is absolute retardation thinking the OP was talking about the Wayback Machine instead.

The webmaster for Archive Today replied to my email right before I posted in I&T. It's a bug on his part.

LOL at a reddit nigger confusing Archive Today with the Wayback Machine.

Colon capital V · Mar 27, 2025

Anyone have any tips on archiving Vimeo video pages? I can download videos from it just fine, but when I try and archive a video link for stuff like upload date and description, it gets cucked by Cloudflare on both archive.today and ghostarchive.

Tread Miller · Apr 5, 2025

Has anyone else had any issues using GhostArchive to archive stuff recently? I can view stuff just fine, but I want to archive some X/Twitter chains and GhostArchive kept giving me errors. When I tried it, I tried switching VPNs but it still kept giving me errors. It also doesn't seem to be an X/Twitter issue, as I tried to archive some random website to test and it still didn't work.

I can still use archive.md to archive just fine, it's just one of the chains I want to archive is like 12 Tweets and I don't really want to archive each one individually.

The Mass Shooter Ron Soye · Apr 5, 2025

Tread Miller said:
Has anyone else had any issues using GhostArchive to archive stuff recently?

Yes, very recently I am getting:

Archiving error
There was an issue trying to archive your webpage or video. Usually, webpages that are bigger than 50 megabytes, or videos longer than 15 minutes, may fail to archive.

But the unusual part is that this error is appearing nearly instantly instead of after e.g. a minute like it used to, which may indicate something is up behind the scenes.

For tweets you can try replacing the account name with "i" to grab more context (I do this from the Archive.today page instead of using the bookmarklet I'm clicking all day every day). Or maybe one of the Nitter instances still works.

https://x.com/tracewoodgrains/status/1907505225281450297

https://x.com/i/status/1907505225281450297

https://archive.ph/BsNrS
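A rough sketch of the "replace the account name with i" trick as a small Python helper, in case you want to do it in bulk. The function name is made up for illustration, and it assumes the URL has the usual /<account>/status/<id> shape shown above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_i_status_url(tweet_url: str) -> str:
    """Rewrite /<account>/status/<id> to /i/status/<id> (hypothetical helper)."""
    parts = urlsplit(tweet_url)
    segments = parts.path.strip("/").split("/")
    # Only handle the plain status form; anything else is rejected.
    if len(segments) < 3 or segments[1] != "status":
        raise ValueError(f"not a tweet status URL: {tweet_url}")
    segments[0] = "i"
    # Keep only /i/status/<id>; the query string and fragment are dropped.
    path = "/" + "/".join(segments[:3])
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(to_i_status_url("https://x.com/tracewoodgrains/status/1907505225281450297"))
# prints https://x.com/i/status/1907505225281450297
```

The result is what you would paste into the Archive.today form.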

Tread Miller · Apr 5, 2025

The Mass Shooter Ron Soye said:
Or maybe one of the Nitter instances still works.

This works, using nitter.space allowed me to archive the *11-Tweet long chain on archive.md

The Mass Shooter Ron Soye · Apr 8, 2025

Did Archive.today stop using "run=1"? That's in the bookmarklet to make it automatically start archiving the URL. Now it opens the page with the URL pre-filled but I have to submit the form.

The Mass Shooter Ron Soye · Apr 18, 2025

The Mass Shooter Ron Soye said:
Did Archive.today stop using "run=1"? That's in the bookmarklet to make it automatically start archiving the URL. Now it opens the page with the URL pre-filled but I have to submit the form.

It's not the worst thing in the world because I can check the URL for cruft, replace X account name with "i", etc.
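For anyone scripting this instead of using the bookmarklet, here is a minimal Python sketch of building the submit link and trimming cruft first. Assumptions: the "url" parameter name is taken from the commonly circulated bookmarklet and is not confirmed anywhere in this thread, "run=1" may no longer auto-submit (as described above), and the "utm_" prefix list is just an example of cruft.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters with these prefixes are tracking cruft.
CRUFT_PREFIXES = ("utm_",)


def strip_cruft(url: str) -> str:
    """Drop query parameters whose names start with a cruft prefix."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(CRUFT_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def archive_submit_url(url: str, base: str = "https://archive.ph/", run: bool = True) -> str:
    """Build a submit link; 'run' and 'url' are assumed parameter names."""
    params = {"run": "1"} if run else {}
    params["url"] = strip_cruft(url)
    return base + "?" + urlencode(params)


print(archive_submit_url("https://example.com/story?utm_source=feed&id=5"))
```

If run=1 really is ignored now, the link still opens the form with the URL pre-filled, so you only have to press submit.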

Dumb Wings · Apr 23, 2025

ferrari superamerica said:
How do I archive an entire twitter account?

1st option offline-twitter https://offline-twitter.com/
Simple, clean, and good for everyday single use. But when I tried to archive an account, it stopped a few months short of the first tweets, so there might be limitations, and I have no clue about the progress of development. Also it's a bit late in getting same-day tweets unless you liked them. It's a localhost web UI. The site has instructions for installing and starting it. Not open source afaik so uh probably not malicious

2nd option gallery-dl
foolproof but not pretty
Using this reddit set up. link [A]
I do prefer "filename": "twitter_{tweet_id}_{author[name]}_{num}.{extension}", instead because by sorting by name I sort by tweet id which sorts it by date.
Don't remember the options and defaults but you can look them up here. https://gdl-org.github.io/docs/configuration.html ctrl-f twitter
Then my commands look like this because it's used for a current active account.
gallery-dl --directory ".\Twitter\test1" "https://twitter.com/search?q=from:test1&f=live" --write-metadata --abort 5
gallery-dl --directory ".\Twitter media\test1 images" "https://twitter.com/test1/media" --write-metadata --abort 5
gallery-dl --directory ".\#test1" "https://twitter.com/hashtag/test1?f=live" --write-metadata --abort 5
Then ask ChatGPT to make a program to convert it to a pretty version. I had some luck a year and a half ago asking it to make a Python program to turn it into an HTML file that looks like Twitter. I didn't quite like the result so I deleted it. Last attempt was a year ago so it has probably gotten better.
Not sure if I set it up the best way so maybe the reddit one is better.
3rd option wait for nitter to add it. It's on their roadmap but it's been there for a while.
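If you run the gallery-dl commands from option 2 for several accounts, a small Python wrapper can build them for you. This is only a sketch: the function names and folder layout are made up for illustration, and it uses just the flags already shown above (--directory, --write-metadata, --abort).

```python
import os
import shlex


def gallery_dl_args(url: str, directory: str, abort_after: int = 5) -> list[str]:
    """Build one gallery-dl command line as an argument list."""
    return [
        "gallery-dl",
        "--directory", directory,
        url,
        "--write-metadata",
        "--abort", str(abort_after),
    ]


def twitter_jobs(account: str) -> list[list[str]]:
    """Commands for an account's tweets and media (hypothetical layout)."""
    return [
        gallery_dl_args(
            f"https://twitter.com/search?q=from:{account}&f=live",
            os.path.join("Twitter", account),
        ),
        gallery_dl_args(
            f"https://twitter.com/{account}/media",
            os.path.join("Twitter media", f"{account} images"),
        ),
    ]


# Print the commands for review instead of running them.
for args in twitter_jobs("test1"):
    print(shlex.join(args))
```

To actually run a job, pass each argument list to subprocess.run(args, check=True). Using a list instead of one long string avoids quoting problems with the & in the search URL.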

The Mass Shooter Ron Soye · Apr 27, 2025

https://kiwifarms.st/threads/hasan-piker-hasanabi.95834/post-21254482

If you didn't know, you can archive individual images with Archive.today. This can be useful when the highest quality version of an image is sitting on the server but not used on an HTML page.

For the New York Times, inspecting the page at the image shows a source set with additional URLs, one of which appears to be the "master" copy. So I archived that. On the archive page, you should be getting the highest quality original with no webp compression, so you can right click and save it, or open in a new tab to zoom in easier.
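If you'd rather not eyeball the source set, here is a rough Python sketch that picks the candidate with the largest width or density descriptor from a srcset string you copied out of the inspector. The example URLs are placeholders. It assumes the URLs themselves contain no commas, and the biggest candidate is not guaranteed to be the "master" copy, so still check by hand.

```python
def parse_srcset(srcset: str) -> list[tuple[str, float]]:
    """Split a srcset value into (url, descriptor size) pairs."""
    candidates = []
    for entry in srcset.split(","):
        fields = entry.split()
        if not fields:
            continue
        url = fields[0]
        # A candidate without a descriptor is treated as "1x".
        descriptor = fields[1] if len(fields) > 1 else "1x"
        try:
            size = float(descriptor[:-1])  # strip the trailing "w" or "x"
        except ValueError:
            size = 0.0
        candidates.append((url, size))
    return candidates


def largest_candidate(srcset: str) -> str:
    """Return the URL with the biggest descriptor value."""
    return max(parse_srcset(srcset), key=lambda pair: pair[1])[0]


example = (
    "https://example.com/photo-600.jpg 600w, "
    "https://example.com/photo-1200.jpg 1200w, "
    "https://example.com/photo-master.jpg 2048w"
)
print(largest_candidate(example))
# prints https://example.com/photo-master.jpg
```

The URL it returns is the one to submit to Archive.today.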

When it comes to archiving individual files with Archive.today, the type that has given me the most trouble is probably PDFs. Archiving one produces a snapshot of the first page and no more AFAICT. Yeah, here's an example someone did from NYT. Ghostarchive has better handling of PDFs if it doesn't reject them based on size.
