Archival Tools - How to archive anything.

web.archive.org works, but you can't be certain it will always be there. Still worth making a copy anyway.

Speaking of archive.org, I tried to create a new account recently and noticed they don't accept most cock.li domains anymore. I created a number of accounts last year with no problem, but this time it only accepted cock.email and none of the other alternate domains. It might be a good time to create a handful while that option still exists.
 
Many sites these days ban cock.li. It's worth looking into similar privacy-respecting alternatives to get around the blocks. There is a list here: https://offshore.cat/email
 
I found out how to get to the dedicated Disqus pages for these articles. Unfortunately, not all of the 957 comments will load, unless there's something further that can be done with the URL (probably not). But you can at least archive the newest and oldest comments instead of the highest rated:
Best (archive)
Newest (archive)
Oldest (archive)
I was looking for a way to separately archive Disqus comment threads months ago, and I found it for one of Alyssa Mercante's college articles.

To get at it, I looked at the source, searched for "disqus", and found "s.src = '//pipedream.disqus.com/embed.js';" where "pipedream" is the site identifier. I think Brave AI told me how to get this URL format listing all the site's articles:
"disqus.com/home/forum/pipedream/"
And then I just scrolled until I got to around 13 years ago and saw the article I wanted:
"disqus.com/home/discussion/pipedream/binghamtons_four_noble_truths_the_way_i_lived_them_pipe_dream/"

The second identifier does not match the article's URL, but it does match the page title with punctuation stripped, spaces replaced with underscores, and everything lowercased. So that's probably how it's generated. I don't know what it would do if a site has multiple articles with the same title. Maybe it adds a number at the end.

Why bother? It allowed more comments to be archived, since there are different URLs for the best, newest, and oldest comments. But I still can't get at all of them. It's also possible that Disqus threads can still be accessed even if the page where they were embedded is deleted.
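The steps described above can be sketched roughly like this. It is only a guess at the pattern from the one example in this post, not Disqus's documented behavior. The function names are made up, the https:// prefix is an assumption, and the sample title is reconstructed from the slug rather than copied from the real page.

```javascript
// Pull the Disqus site identifier (shortname) out of a page's source.
// Assumes the embed line looks like: s.src = '//<shortname>.disqus.com/embed.js';
function disqusShortname(pageSource) {
  const match = pageSource.match(/\/\/([a-z0-9-]+)\.disqus\.com\/embed\.js/i);
  return match ? match[1] : null;
}

// Listing of all the site's articles (URL format quoted in the post).
function disqusForumUrl(shortname) {
  return "https://disqus.com/home/forum/" + shortname + "/";
}

// Guessed slug rule: lowercase, punctuation stripped, spaces -> underscores.
// Only handles ASCII titles; anything else is simply dropped (assumption).
function disqusTitleSlug(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "") // strip punctuation
    .trim()
    .replace(/\s+/g, "_");       // collapse runs of whitespace into one underscore
}

// Dedicated thread page for one article.
function disqusDiscussionUrl(shortname, title) {
  return "https://disqus.com/home/discussion/" + shortname + "/" + disqusTitleSlug(title) + "/";
}

// Example with the embed line from the post and a hypothetical page title.
const shortname = disqusShortname("s.src = '//pipedream.disqus.com/embed.js';");
console.log(disqusForumUrl(shortname));
console.log(disqusDiscussionUrl(shortname, "Binghamton's Four Noble Truths: The Way I Lived Them - Pipe Dream"));
```

If two articles share a title, this will produce the same slug for both, so check the forum listing page instead of trusting the guess.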
 
Seeing this is the most active thread on technology, I guess this is the best place to ask (sorry if it isn't thread related, but bear with me).
Is there any way to archive an entire Discord server without the owners/mods/users realizing what happened?
Like making a 1:1 copy of all the messages on the server, so that it can be read all the way from the first message to the last?
If there is a thread for this kind of question, I would gladly go ask there if directed.
In any case, thanks.
 
There's a thread about archival tools. People over there have recommended DiscordChatExporter before, and it still seems maintained, but you should know Discord doesn't appreciate you using anything but their botnet client, and it's also a violation of their terms of service. That said, there's no indication anyone was ever banned for using that tool specifically. There are also browser extensions, since we're on topic, but I'd be very careful around those.
 
Thank you kindly,
I will be wary of those extensions.
 
Also, is there a way for the user who extracted the logs to be tracked down? (i.e., their PC identified)
Asking for Opsec reasons.
 
Not inherently. However, if you for example export all the channels that a user has access to and some of them are restricted by role, then it can be inferred that the user who exported it must have had a certain set of roles. Whether that's enough to narrow it down to you is hard to say without knowing further details.

If you're using DiscordChatExporter, you should also export in UTC (--utc on the command line) so your timezone doesn't show up, and set the en-SE locale (--locale en-SE) so your own locale doesn't show up and dates use the ISO 8601 format.
 
Did Archive.today stop using "run=1"? That's in the bookmarklet to make it automatically start archiving the URL. Now it opens the page with the URL pre-filled but I have to submit the form.
It's not the worst thing in the world because I can check the URL for cruft, replace X account name with "i", etc.
It seems to have been suddenly fixed today. I got caught off guard because it wasn't automatically starting just a few hours ago.

I decided to remove it for the moment. The original bookmarklet, with run=1:
JavaScript:
javascript:void(open('https://archive.ph/?run=1&url='+encodeURIComponent(document.location)))
And without it:
JavaScript:
javascript:void(open('https://archive.ph/?url='+encodeURIComponent(document.location)))
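For anyone who wants the cruft check done automatically, here is a minimal sketch of the same bookmarklet logic written out as ordinary functions. The parameter list is a hypothetical starting point (only the "utm_" prefix, fbclid and gclid are handled), and the function names are made up for illustration.

```javascript
// Hypothetical list of tracking parameters to drop; extend it to taste.
const TRACKING_PARAMS = ["fbclid", "gclid"];

// Remove tracking parameters from a URL, leaving everything else alone.
function stripCruft(pageUrl) {
  const url = new URL(pageUrl);
  // Copy the keys first so deleting doesn't disturb the iteration.
  for (const key of [...url.searchParams.keys()]) {
    if (key.startsWith("utm_") || TRACKING_PARAMS.includes(key)) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}

// Build the archive.ph submission URL, with or without run=1.
function archiveTodayUrl(pageUrl, autoRun) {
  const base = "https://archive.ph/?" + (autoRun ? "run=1&" : "") + "url=";
  return base + encodeURIComponent(stripCruft(pageUrl));
}

console.log(archiveTodayUrl("https://example.com/post?id=5&utm_source=x", true));
console.log(archiveTodayUrl("https://example.com/post?id=5&utm_source=x", false));
```

To use it as a bookmarklet, the functions would have to be squeezed into a single javascript: line and called with location.href, which is left out here to keep it readable.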
 
So, in trying to ensure all the X/Twitter links that get posted to USPG2 are archived and not lost to time, I am noticing that Archive Today does not seem to archive X/Twitter as well as it once did. You can get the text of the immediate post, but quoted posts, replies, pictures, and media are usually missing. Ghost Archive running in Web Worker mode seems to work well, managing to replay the page exactly as it would have looked had you visited it directly, but the Web Worker mode doesn't work in Tor Browser, Firefox Private Mode, or Chrome Private Mode. I've only been able to make it work in stock Chrome, and of course, I'd rather not feed Google information about my drama-browsing habits. Ghost Archive's static mode isn't much better than Archive Today.

This leads to my next thought: Since some sites work better with some archive tools than they do with others, do other Kiwis think it would be worth compiling and maintaining a list of known troublesome sites, and listing which archive tools work better/worse with those sites?
 
Archive Today does not seem to archive X/Twitter as well as it once did
This dates back to, I believe, Twitter requiring people to be logged-in to read anything more than single tweets.

For threads, I've just been using threadreaderapp.com and then archiving the thread page there. Bear in mind that the Twitter API rules require API users to delete posts that are deleted on Twitter, so a link to threadreaderapp.com is no better than a link to the currently-live tweet.

Obviously it doesn't include replies but it handles embedded media well.
 
I am noticing that Archive Today does not seem to archive X/Twitter as well as it once did.
If you replace the username in the url with "i" then Archive Today will grab some replies too.
I've only been able to make it work in stock Chrome, and of course, I'd rather not feed Google information about my drama-browsing habits.
Have you tried using regular (non-private) Firefox? It's what I use if I ever need to use web workers.
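The username-to-"i" trick mentioned above is easy to automate before submitting a link. This is just a sketch: the host list and the function name are assumptions, and it only touches URLs shaped like /<username>/status/<id>.

```javascript
// Hosts treated as X/Twitter (assumed list, adjust as needed).
const X_HOSTS = ["x.com", "www.x.com", "twitter.com", "www.twitter.com", "mobile.twitter.com"];

// Replace the account name in a status URL with "i".
// Any URL that isn't an X/Twitter status link is returned unchanged.
function anonymizeStatusUrl(pageUrl) {
  const url = new URL(pageUrl);
  if (!X_HOSTS.includes(url.hostname)) {
    return pageUrl;
  }
  url.pathname = url.pathname.replace(/^\/[^/]+\/status\//, "/i/status/");
  return url.toString();
}

console.log(anonymizeStatusUrl("https://x.com/someuser/status/1234567890"));
console.log(anonymizeStatusUrl("https://example.com/someuser/status/1"));
```

The result can then be pasted into Archive Today, or passed to a bookmarklet in place of location.href.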
 
Code:
javascript:void(window.open('https://archive.today/?run=1&url='+location.href).opener=null)
Code:
javascript:void(window.open('https://ghostarchive.org/save/'+location.href).opener=null)
Code:
javascript:void(window.open('https://web.archive.org/save/'+location.href).opener=null)
Code:
javascript:void(window.open('https://preservetube.com/save?url='+location.href).opener=null)
edit: from wiki, for those wondering how to use these

Visit the webpage you want to archive and click the appropriate bookmarklet you just created.
To add to those, here's a bookmarklet for archiving at megalodon.jp:
Code:
javascript:void(window.open('https://megalodon.jp/pc/main?url='+location.href).opener=null)
To search:
Code:
javascript:void(window.open('https://megalodon.jp/?url='+location.href).opener=null)
If you don't speak moon runes, solve the Cloudflare challenge and click the blue button on the left to archive the page:
megalodon.webp
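The window.open bookmarklets in this thread mostly share one shape: a save URL prefix with the current page address stuck on the end, and .opener=null so the archive tab gets no reference back to the tab you came from. A small sketch that generates them from a table, in case you want to add more services. The makeBookmarklet name and the SERVICES table are made up for illustration; the prefixes are the ones posted above.

```javascript
// Build a bookmarklet string from a save-URL prefix.
// The prefix must not contain a single quote, or the generated code breaks.
function makeBookmarklet(savePrefix) {
  return "javascript:void(window.open('" + savePrefix + "'+location.href).opener=null)";
}

// Save-URL prefixes taken from the bookmarklets posted in this thread.
const SERVICES = {
  "Ghost Archive": "https://ghostarchive.org/save/",
  "Wayback Machine": "https://web.archive.org/save/",
  "PreserveTube": "https://preservetube.com/save?url=",
  "Megalodon": "https://megalodon.jp/pc/main?url=",
};

// Print one bookmarklet per service, ready to paste into a bookmark's URL field.
for (const [name, prefix] of Object.entries(SERVICES)) {
  console.log(name + ": " + makeBookmarklet(prefix));
}
```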
 