Archival Tools - How to archive anything.

Is there a way to archive Twitter dms or just screen caps?
Depends on your 'use case'.

I have used this tool (some technical aptitude required) to capture DM chains from group chats etc. However, I have not tried it since the UI changes, so it might be broken. It captures a text record plus photos/videos from the convos.

If you want it going through a third party for a little external verification, you could try webrecorder.io, following the instructions here to ensure you don't give away your phone # tied to the account. I haven't verified that it works for DMs, especially with the new UI, but it's got about the best chance of working of any archive tool. Note that if you want to avoid revealing what your username is on there, you'd want to change your username before logging in to Twitter.
 
I'm having a problem with archive.today: it's not archiving news sites, but it works for other stuff. Is there any way for me to get around this, or a similar archiving service I could use instead?
 
Anyone know of any programs that can help with corrupted video files? I was trying to record a stream for a potential thread yesterday and ShareX decided to shit the bed at the end, with a window full of error messages when I tried to stop the recording.
 
Is anyone else having problems with youtube-dl for Windows? I've uninstalled and reinstalled it, and it just keeps saying I've already downloaded the video I'm trying to download.
 
Is anyone able to access archive.today? I'm getting either a "connection refused" or a "your Internet access is blocked" error. I don't think it's on my end; I checked and double-checked my firewall. I could initially get around it by using a non-US server, but even that sometimes fails now.
 
Working fine for me.

I sometimes get that, though. It sometimes seems to block VPNs, possibly because people DDoS the site and/or run bots from them.
 
I have used this a few times for miscellaneous things and it works okay. It is basically a simple frontend for the internet archive's crawler software known as Heritrix. It can be simple or if you want to get your hands dirty you can get into the Heritrix config stuff and do specific stuff with your crawls.

The program is called the Web Archiving Integration Layer (WAIL) - Link
 
Not sure if this is the best place to ask, but for stuff that is too voluminous to add to this site (e.g. 150 GB of Chantal videos, etc.), does anybody have any better suggestions than splitting it across a ton of MEGA accounts? MEGA is very user-friendly and has good speeds/restrictions, but its claim of 50 GB free storage is a lie: it's 15 GB, with the other 35 GB for the first month only, which is useless.
 
Is there a good, free way to archive someone's entire tweet history? Archive.is only snapshots their Twitter timeline down to the first point where it has to load more tweets. Scrolling to the beginning and using Brave's "Save Page As..." function creates an HTML file that just loads as a blank background with a bunch of blown-up button images and no text. Saving the fully-loaded timeline as a PDF creates a file with the appropriate number of pages, but the text of the tweets stops after the first two pages or so.
 
Yes, you can scroll to the bottom, then use Full Page Screen Capture to capture the entire page. Twitter unloads content that you aren't viewing or some shit, but the extension physically scrolls down the page, so everything gets reloaded as it's captured.

Also, speaking of Twitter archiving, I made a little userscript that adds a convenient button to tweets for archiving them:

JavaScript:
// ==UserScript==
// @name         Twitter Archive Button
// @namespace    http://tampermonkey.net/
// @match        https://twitter.com/*
// @grant        GM_addStyle
// ==/UserScript==

GM_addStyle(`.archiveButton{
    position: absolute;
    right: 0px;
    bottom: 0px;
    color: grey;
    background: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAANCAYAAACgu+4kAAADUXpUWHRSYXcgcHJvZmlsZSB0eXBlIGV4aWYAAHjarZdhkuMoDIX/6xR7BCQQguNgMFV7gz3+PjBOJ5lMd2ZiUzZYYEl+n3AqtP/3b6d/cHAKjoJaijlGhyPkkKVgkNxx5HllF+Z13bhz8GAn19aEwOTR++PWylpfYNevB84YvD3aKa0ZScvRGXI59COyYNDuk4RdDjuH5SjvxyDmZPepbnL0dS2cqawzMB9SrGDjnh4MBpWaIpAX2T17N6/hyMAfZ8HJuMIuY+QxVi80O16ZQJCH17sp6+4FehD5HNGz+qf2z+JLWSv8k5bxpBZfT7C+Fn9KfBfY3zKSx4meJ9TH11ln7y31vh9vV0KEonFVlKNTnemkt21Jn5BZdIZTMbbZMlpyxVUgb666Da1yZgGVThy4ceHO++wrV6QYZBdDL1LFT1vyJlnqJBZG4y7ms28+gWWVnYAveLnlwjNunvEqJ0RujKXCcMZ45LeNvpv8k0a91yERDzHboRXyklHXSGOQG1esAhDui5tOgc+28Lu7wkKpgqBOmRNesLjtcLEpf9WWn5w91in6YwsxWVsO5pZximTYg4CL7JUjOxMxZuiYAKggc8He2ECAVaUhSQneRyGTJCM2njGea0UlyjDj2wQQ6qM3sMm+AFYIivqxkFBDRb0GVY1qmkizluhjiBpjtDg+csW8BVOLZpYsW0k+haQpJksp5VSyZI9voOaYLaeccylCBYEKfBWsL7BssvktbLrFzba05a1UlE8NVWusVlPNtTRpvuEz0WKzllpuZWfa8aXYw6573G1Pe95LR61130PXHrv11HMvN2qL6i/tD6jxoiaT1FhnN2qwktnpgsfnRAczEJPAIG6DAApaBjOXOAQZ5AYzlwWbQgVJ6mBDjQcxIAw7i3a+sfsi9xY30vQWN/mJHA10V5AjoPuV2wtqbfzO1Uns2IVDU+ex+7CmSCKczuHyaf+RI79/meiCZN5y5Pfe3/JF1yj0raM7BV7kda/P6OkCfWYY+j7Oj9ncZukz6F/uqJdrcqJPC/E00c883qf2dtF9F4zeifZOTnTJTruE2u3VLqnrJ/x/m83nGt2FoSuy+UyjpzB0RTZ/r9GLMPReNj/39GkR3e01/Iri3xz9D+anbdK9jwFiAAABhmlDQ1BJQ0MgcHJvZmlsZQAAeJx9kTtIw1AUhv+mSkWqDnYQcchQH4MFURFHrUIRKoRaoVUHk5s+hCYNSYqLo+BacPCxWHVwcdbVwVUQBB8gTo5Oii5S4rlJoUWMBy7347/n/7n3XEColZhmtY0Bmm6bqURczGRXxNArAuhGGMMYkZllzEpSEr71dU/dVHcxnuXf92d1qTmLAQGReIYZpk28Tjy1aRuc94kjrCirxOfEoyZdkPiR64rHb5wLLgs8M2KmU3PEEWKx0MJKC7OiqRFPEkdVTad8IeOxynmLs1aqsMY9+QvDOX15ieu0BpDAAhYhQYSCCjZQgo0Y7TopFlJ0Hvfx97t+iVwKuTbAyDGPMjTIrh/8D37P1spPjHtJ4TjQ/uI4H4NAaBeoVx3n+9hx6idA8Bm40pv+cg2Y/iS92tSiR0DPNnBx3dSUPeByB+h7MmRTdqUgLSGfB97P6JuyQO8t0Lnqza1xjtMHIE2zSt4AB4fAUIGy13ze3dE6t397GvP7AXK7cqfIpiNlAAAABmJLR0QA/wD/AP+gvaeTAAAACXBIWXMAACE3AAAhNwEzWJ96AAAAB3RJTUUH5AEODTQ0Dvg5pgAAAI5JREFUKM/dkiEOwkAQRd+QNTg0Z8A1XAzHCViF6i3QWNIzELgCGtEE+TDUbEvSFsfI+f+/mUwm1DvwYF6tUZuZYdQmAa16AZ7Aa2R2CayAtiPVajVhcqXWAGlABDgCm0K6RcSu9KcvQ7bAvugdh
ow9QESg5oEN8ijAB3Iee48FP9YfALojXoGsTvnEE8AbyJYyvh90XDsAAAAASUVORK5CYII=');
    background-repeat: no-repeat;
    width: 16px;
    height: 16px;
    opacity: 0.5;
    margin: 10px;
}

.archiveButton:hover{
    opacity: 1;
}`);

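// archive.today submit endpoint; the tweet URL is appended URL-encoded
const base = "https://archive.today/?run=1&url=";

// Add an archive button to every tweet on the page that lacks one
function update(){
    const tweets = document.querySelectorAll("article");

    for(const tweet of tweets){

        // skip tweets that already have a button
        if(tweet.querySelector(".archiveButton"))
            continue;

        // the first link containing /status/ is taken as the tweet's permalink
        const anchor = tweet.querySelector("[href*='/status/']");
        if(!anchor)
            continue;
        // keep only https://twitter.com/<user>/status/<id>, dropping suffixes like /photo/1
        const link = anchor.href.split("/").slice(0, 6).join("/");

        const button = document.createElement("a");

        button.className = "archiveButton";
        button.href = base + encodeURIComponent(link);
        button.target = "_blank";
        button.rel = "noopener";

        tweet.appendChild(button);
    }

}

// Twitter loads tweets dynamically, so re-run update() whenever the DOM changes
const observer = new MutationObserver(update);

observer.observe(document.body, {childList:true, subtree:true});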
 
For a full tweet history, Twint does a great job of archiving tweets to CSV, JSON, a SQLite DB, etc., though you need to do a little work at the command line and install a recent version of Python 3.
https://github.com/twintproject/twint/

I don't believe it has a good way to automatically save images associated with a post, however.
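
For anyone who'd rather script it than use the command line, here is a rough sketch of dumping one account's timeline to CSV through Twint's Python module. The attribute names (Username, Store_csv, Output, Hide_output) and twint.run.Search follow the project README as I remember it; the helper names twint_settings and dump_timeline are made up for this example, and I haven't checked it against the current Twitter UI.

```python
def twint_settings(username, csv_path):
    """Settings for a CSV dump of one account (names mirror twint.Config attributes)."""
    return {
        "Username": username.lstrip("@"),  # twint expects the handle without the @
        "Store_csv": True,                 # write results as CSV
        "Output": csv_path,                # file the CSV is written to
        "Hide_output": True,               # don't echo every tweet to the console
    }


def dump_timeline(username, csv_path):
    """Run the scrape. Needs the third-party twint package and network access."""
    import twint  # imported here so the settings helper works without twint installed

    config = twint.Config()
    for name, value in twint_settings(username, csv_path).items():
        setattr(config, name, value)
    twint.run.Search(config)
```

Calling dump_timeline("someuser", "tweets.csv") should leave a tweets.csv in the working directory, assuming Twint still works against the site.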
 
Use Twint to get a CSV, then you can use the script below to download the pictures. Save it as archivePics.py and run:
python3 archivePics.py tweets.csv folder
Python:
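import ast
import csv
import os
from sys import argv

import requests

csv_path, out_dir = argv[1], argv[2]
os.makedirs(out_dir, exist_ok=True)  # create the output folder if it is missing

with open(csv_path, newline="", encoding="utf-8") as f:
    for tweet in csv.DictReader(f):
        # the "photos" column holds a single-quoted list literal,
        # so parse it as Python rather than swapping quotes for JSON
        pics = ast.literal_eval(tweet["photos"] or "[]")
        for pic in pics:
            r = requests.get(pic, timeout=30)
            if not r.ok:
                continue  # skip deleted or unavailable images
            # name the file after the part of the URL following /media/
            name = pic.split("/media/")[1]
            with open(os.path.join(out_dir, name), "wb") as out:
                out.write(r.content)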
should work fine
 
Personally, I like Greenshot and ShareX. Where would the Internet be without all its history, after all? A poorer and likely much less funny place. I also like pastebin/hastebin for when you absolutely must get it down.
 
Does anyone know of a tool that I can use to archive multiple pages on archive.today?

Essentially, I have a list of URLs I'd like to archive. I tried archiving via a browser and it looks like there's a queue of sorts, so it's going to take forever. Being able to just run a script, have it save the archived links, and let it buzz along in the background would be amazing.

I know there are scripts/programs that do this for pastebins and image upload sites: they upload a text file and save/return a link to the paste plus a deletion link. Maybe I can rework one of those for this purpose, but it'd be convenient if there already was a tool to do this.

Update: I starred something on GitHub forever ago that looks to fit the bill.

For anyone looking for a way to automate archiving massive amounts of shit, https://github.com/pastpages/archiveis looks pretty useful.
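
A rough sketch of how that could be wired up for a list of URLs, assuming the archiveis package exposes archiveis.capture(url) returning the archived link (that is how I read its README). The function names archive_all and archive_file, the tab-separated output, and the 5-second delay are my own choices for illustration, not anything the library prescribes.

```python
import time


def archive_all(urls, capture, delay=5.0):
    """Archive each URL in turn; returns {original_url: archived_url_or_error}."""
    results = {}
    for raw in urls:
        url = raw.strip()
        if not url or url in results:
            continue  # skip blank lines and duplicates
        if results:
            time.sleep(delay)  # pause between submissions to go easy on the queue
        try:
            results[url] = capture(url)
        except Exception as exc:  # keep going if one submission fails
            results[url] = f"FAILED: {exc}"
    return results


def archive_file(list_path, out_path, delay=5.0):
    """Read one URL per line from list_path and write 'url<TAB>archive' lines."""
    import archiveis  # third-party: pip install archiveis

    with open(list_path, encoding="utf-8") as f:
        results = archive_all(f, archiveis.capture, delay)
    with open(out_path, "w", encoding="utf-8") as out:
        for url, archived in results.items():
            out.write(f"{url}\t{archived}\n")
    return results
```

Something like archive_file("urls.txt", "archived.tsv") could then be left running in the background. The capture function is passed in as a parameter so the loop can be tried out with a dummy function before pointing it at the real site.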
 
I've made a little .bat script for easy youtube-dl archiving.

After unzipping the ytdl folder, run the youtube-dl.bat file. It has three options: Archive, Update and Exit.

To choose an option, press the number given to the option.

If you choose Archive, you will then be asked to give a URL. Simply paste the URL of the video you want to archive and press Enter. If you wish to archive several videos at once, simply paste all the links into the URL field, separating them with spaces.

If nothing wants to download, simply choose Update, as the main reason for the video downloads failing is YouTube changing something, causing youtube-dl to be broken until the next update.

If you want to exit, you can choose the third option, or just close the window.

All the videos get downloaded to the "downloads" folder. They get saved in a specific hierarchy. First, a folder with the name of the uploader. Then, a folder with the name of the video. In it, you will find the video file with the upload date in square brackets at the beginning of the name, as well as the video thumbnail, video description, auto subtitles, and user made subtitles, if they are available.

I believe it's simple enough for anyone to use, as it only involves pressing numbers and pasting links. Unless people here panic at the sight of a command line, then I can't help much, as youtube-dl is a command line tool, and I don't have the experience to make GUI tools.
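
For comparison, roughly the same folder layout can be had from youtube-dl's Python module. This is only a sketch of the hierarchy described above, not the contents of the .bat file (which I haven't reproduced here); the option names (outtmpl, writethumbnail, writedescription, writesubtitles, writeautomaticsub, ignoreerrors) are youtube-dl's own, while build_options, split_urls and archive_videos are names invented for the example.

```python
def build_options(download_root="downloads"):
    """youtube-dl options for: root/uploader/title/[upload date] title.ext"""
    outtmpl = "/".join([
        download_root,
        "%(uploader)s",                          # folder named after the uploader
        "%(title)s",                             # folder named after the video
        "[%(upload_date)s] %(title)s.%(ext)s",   # file prefixed with the upload date
    ])
    return {
        "outtmpl": outtmpl,
        "writethumbnail": True,     # save the thumbnail next to the video
        "writedescription": True,   # save the description as a text file
        "writesubtitles": True,     # user-made subtitles, if available
        "writeautomaticsub": True,  # auto-generated subtitles, if available
        "ignoreerrors": True,       # carry on with the next URL if one fails
    }


def split_urls(pasted):
    """Turn a space-separated paste of links into a list."""
    return pasted.split()


def archive_videos(pasted, download_root="downloads"):
    """Download every pasted URL. Needs the third-party youtube_dl package."""
    import youtube_dl  # imported here so the helpers above work without it

    with youtube_dl.YoutubeDL(build_options(download_root)) as ydl:
        return ydl.download(split_urls(pasted))
```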
 

I usually just use this for youtube-dl, which I ripped from somewhere like stackexchange.

Code:
youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4' URL

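It picks the best MP4 video stream and the best M4A audio stream and merges them into a single mp4 (merging needs ffmpeg). If that combination isn't available, it falls back to the best single-file mp4.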
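
One catch with the one-liner: the single quotes are shell quoting, and the Windows command prompt doesn't treat them as quotes. A way around that is to launch youtube-dl from a small Python wrapper and pass the arguments as a list, so no shell quoting is involved at all. This is a sketch under that assumption; build_command and download are hypothetical helper names, and --merge-output-format is added by me to make the mp4 container explicit.

```python
import subprocess

# Same selector as above: best mp4 video + best m4a audio, else best single mp4
FORMAT = "bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4"


def build_command(url, exe="youtube-dl"):
    """Argument list for subprocess; each item is passed through unquoted."""
    return [exe, "-f", FORMAT, "--merge-output-format", "mp4", url]


def download(url):
    """Run youtube-dl (must be on PATH) and return its exit code."""
    return subprocess.run(build_command(url), check=False).returncode
```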
 
Does anyone know how to download/archive an "Instagram Live" video? Most IG downloader apps only work if the user has saved the video to their "story", but the one I am after was only a livestream and it expires about 20 hours from now. Worst case scenario I'll use a screen recorder app but that will wreck the quality especially as most of the content I want to record is not so much the video as things happening in the on-screen comments, which I need to be readable as a consequence.

Any pointers? Windows or Android. Running out of time here!
 