Archival Tools - How to archive anything.

Here's a useful tool. Install GPAC; it works on Windows, macOS, Linux, iOS, and Android.


The tool I'm interested in is a command-line tool called MP4Box. You use it like this:

$ MP4Box -split-size 100000 BigFile.mp4

It will split BigFile.mp4 into BigFile_001.mp4, BigFile_002.mp4 ... BigFile_nnn.mp4, each under the 100MB upload limit here (-split-size takes the size in kilobytes). It splits on keyframes and without re-encoding, so it's fast and, unlike FFmpeg, doesn't leave weird bits of sound with no video.
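
If you want to sanity-check that each piece came in under the limit, a quick listing does it (filenames assume the BigFile example above):

$ ls -lh BigFile_*.mp4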

https://trac.ffmpeg.org/wiki/Seeking#Seekingwhiledoingacodeccopy

Using -ss as input option together with -c:v copy might not be accurate since ffmpeg is forced to only use/split on i-frames. Though it will—if possible—adjust the start time of the stream to a negative value to compensate for that. Basically, if you specify "second 157" and there is no key frame until second 159, it will include two seconds of audio (with no video) at the start, then will start from the first key frame. So be careful when splitting and doing codec copy.
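
To make that caveat concrete, here's roughly what a copy-mode split looks like (the timestamps and filename are placeholders): seeking with -ss before the input and copying streams with -c copy means ffmpeg can only start the video at the next keyframe.

$ ffmpeg -ss 157 -i BigFile.mp4 -t 60 -c copy clip.mp4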

Also, unlike FFmpeg, you don't need to work out the timestamps by hand.
 
I just want to ask what I should do if the video that I wanna put in my post is too big for the player on kw?
 
I just want to ask what I should do if the video that I wanna put in my post is too big for the player on kw?
I post to archive.org, which doesn't usually take videos down (and automatically makes a torrent). If it's really important (e.g. the Christchurch shooting), ask Null to make a torrent or make it yourself.
 
Latest ytdl

Bash:
function ytdl
{
    if [[ $1 == "mp3" ]]; then
        # Grab the m4a audio stream and re-encode it to 128k mp3
        youtube-dl --extract-audio --audio-format mp3 --audio-quality 128K -f m4a "$2"
    elif [[ $1 == "m4a" ]]; then
        # Audio only, no re-encoding
        youtube-dl -f m4a "$2"
    elif [[ $1 == "jpg" ]]; then
        # Thumbnail only, skip the video
        youtube-dl --write-thumbnail --skip-download "$2"
    elif [[ $1 == "-F" ]]; then
        # List the available mp4 formats
        youtube-dl -F "$2" | grep mp4
    else
        # Otherwise treat $1 as a maximum height and fetch the best mp4 at or below it
        youtube-dl --embed-thumbnail -f "bestvideo[height<=${1}][ext=mp4]+bestaudio[ext=m4a]/best[height<=${1}][ext=mp4]" "${2}" "-o%(title)s-%(id)s-%(height)sp.%(ext)s"
    fi
}

Somewhat bloated. You can do

ytdl [mp3|m4a|jpg] url to get an mp3, an m4a, or a jpg thumbnail, or

ytdl resolution url to get video <= resolution as an mp4, e.g.

ytdl 360 url for 360p video or
ytdl 720 url for 720p

Refactored with case...esac. Also adds support for AAC audio, which seems to be supported here and avoids re-encoding to MP3:

Bash:
function ytdl
{
    case $1 in

    mp3)
        youtube-dl --extract-audio --audio-format mp3 --audio-quality 128K -f m4a "$2"
        ;;

    aac)
        # New: keep the AAC stream instead of re-encoding it to mp3
        youtube-dl --extract-audio --audio-format aac -f m4a "$2"
        ;;

    m4a)
        youtube-dl -f m4a "$2"
        ;;

    jpg)
        youtube-dl --write-thumbnail --skip-download "$2"
        ;;

    -F)
        youtube-dl -F "$2" | grep mp4
        ;;

    *)
        youtube-dl --embed-thumbnail -f "bestvideo[height<=${1}][ext=mp4]+bestaudio[ext=m4a]/best[height<=${1}][ext=mp4]" "${2}" "-o%(title)s-%(id)s-%(height)sp.%(ext)s"
        ;;
    esac
}

Consider this video:

Bash:
$ ytdl -F https://www.youtube.com/watch?v=ZZWMxwcFZOc
140          m4a        audio only tiny  130k , m4a_dash container, mp4a.40.2@128k (44100Hz), 7.28MiB
160          mp4        256x144    144p  111k , avc1.4d400c, 30fps, video only, 3.91MiB
133          mp4        426x240    240p  245k , avc1.4d4015, 30fps, video only, 8.20MiB
134          mp4        640x360    360p  516k , avc1.4d401e, 30fps, video only, 15.27MiB
135          mp4        854x480    480p  805k , avc1.4d401f, 30fps, video only, 24.67MiB
136          mp4        1280x720   720p 1196k , avc1.4d401f, 30fps, video only, 45.16MiB
298          mp4        1280x720   720p60 3473k , avc1.4d4020, 60fps, video only, 136.98MiB
137          mp4        1920x1080  1080p 4332k , avc1.640028, 30fps, video only, 137.90MiB
299          mp4        1920x1080  1080p60 6890k , avc1.64002a, 60fps, video only, 258.01MiB
18           mp4        640x360    360p  638k , avc1.42001E, 30fps, mp4a.40.2@ 96k (44100Hz), 35.88MiB
22           mp4        1280x720   720p  932k , avc1.64001F, 30fps, mp4a.40.2@192k (44100Hz) (best)

If you do this

ytdl 720 https://www.youtube.com/watch?v=ZZWMxwcFZOc

The function above will get the 60fps version (136.98MiB), not the 30fps version (45.16MiB), which is probably not what you want for archiving here. Here's a hack to only get videos under 60fps:

Bash:
function ytdl
{
    case $1 in

    mp3)
        youtube-dl --extract-audio --audio-format mp3 --audio-quality 128K -f m4a "$2"
        ;;

    aac)
        youtube-dl --extract-audio --audio-format aac -f m4a "$2"
        ;;

    m4a)
        youtube-dl -f m4a "$2"
        ;;

    jpg)
        youtube-dl --write-thumbnail --skip-download "$2"
        ;;

    -F)
        youtube-dl -F "$2" | grep mp4
        ;;

    *)
        # [fps<60] filters out the 60fps variants
        youtube-dl --embed-thumbnail -f "bestvideo[height<=${1}][fps<60][ext=mp4]+bestaudio[ext=m4a]/best[height<=${1}][ext=mp4][fps<60]" "${2}" "-o%(title)s-%(id)s-%(height)sp.%(ext)s"
        ;;
    esac
}

I noticed the tool doesn't work with BitChute:

Bash:
$ ytdl 1080 https://www.bitchute.com/video/mSfD78emB7CQ/
[BitChute] mSfD78emB7CQ: Downloading webpage
[BitChute] mSfD78emB7CQ: Checking video URL
ERROR: requested format not available

This is probably because BitChute, being uncouth, doesn't return any metadata at all and only allows one download format:

Bash:
$ youtube-dl -F https://www.bitchute.com/video/mSfD78emB7CQ/
[BitChute] mSfD78emB7CQ: Downloading webpage
[BitChute] mSfD78emB7CQ: Checking video URL
[info] Available formats for mSfD78emB7CQ:
format code  extension  resolution note
0            mp4        unknown

youtube-dl has conditional operators for missing metadata. From the documentation:


Formats for which the value is not known are excluded unless you put a question mark (?) after the operator. You can combine format filters, so -f "[height <=? 720][tbr>500]" selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s.

You can modify ytdl like this: just add a '?' after each comparison operator so the filter is ignored if the metadata is not present.

Bash:
function ytdl
{
    case $1 in

    mp3)
        youtube-dl --extract-audio --audio-format mp3 --audio-quality 128K -f m4a "$2"
        ;;

    aac)
        youtube-dl --extract-audio --audio-format aac -f m4a "$2"
        ;;

    m4a)
        youtube-dl -f m4a "$2"
        ;;

    jpg)
        youtube-dl --write-thumbnail --skip-download "$2"
        ;;

    -F)
        youtube-dl -F "$2" | grep mp4
        ;;

    *)
        # '?' after an operator makes the filter pass when the metadata is missing
        youtube-dl --embed-thumbnail -f "bestvideo[height<=?${1}][fps<?60][ext=mp4]+bestaudio[ext=m4a]/best[height<=?${1}][ext=mp4][fps<?60]" "${2}" "-o%(title)s-%(id)s-%(height)sp.%(ext)s"
        ;;
    esac
}

At which point it works:

Bash:
$ ytdl 1080 https://www.bitchute.com/video/mSfD78emB7CQ/
[BitChute] mSfD78emB7CQ: Downloading webpage
[BitChute] mSfD78emB7CQ: Checking video URL
[BitChute] mSfD78emB7CQ: Downloading thumbnail ...
[BitChute] mSfD78emB7CQ: Writing thumbnail to: Ethan Who-mSfD78emB7CQ-NAp.jpg
[download] Destination: Ethan Who-mSfD78emB7CQ-NAp.mp4
[download] 100% of 48.77MiB in 00:09
[atomicparsley] Adding thumbnail to "Ethan Who-mSfD78emB7CQ-NAp.mp4"
 
The archive.today guy is now in a pissing match with the Brave team. Try loading the site with Brave and you get this:
[screenshot]

Before being redirected to this url:

Apparently it's been nearly 2 months of Brave support not responding to his emails about transferring his URLs between accounts to receive BAT again.
[screenshot]

Edit 6/23/20: The Brave redirect has been removed and archive.today is working normally again.
 
Dumb question, but is there something wrong with using the archive dot today url?
 
Do you use Brave? Archive recently had a slapfight with the Brave browser over some monetization fuckery; it was very recently fixed.
No, not that, but it seems like there's an autofilter that changes archive(.)today to archive.md if you write it here.
 
I'm not sure if it's been posted in here, but I think I've seen it posted elsewhere on the site.
With the increase of cows having Discord servers, I thought it would be useful to bring up the Discord Chat Exporter. I'm sure there are other options out there, but here's a link to one.
It's fairly simple to install and has a GUI. It'll turn any chatroom you pick into an .html file that opens in any browser, letting you view the whole Discord chat room's history.
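
If you'd rather script it, the project also ships a command-line version. A rough sketch, assuming the CLI build is on your PATH (the token, channel ID, and output format here are placeholders; check the project's own docs for the exact flags):

Bash:
# Hypothetical invocation: -t is your Discord token, -c the channel ID
DiscordChatExporter.Cli export -t "YOUR_TOKEN" -c 123456789012345678 -f HtmlDark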

On a different note, and less to do with archival, there's a thing called the Sherlock project for finding usernames quickly online: it will look through close to 200 websites for users with the same username. Kiwi Farms used to be on that list before Null hid profiles.
I had difficulty installing this one (thank you @Yotsubaaa) since I couldn't find clear instructions on how to get it going on Windows, but you can search for the username you're looking for and it'll make a text document with the URLs to the indexed profiles. See the sketch below.
It should also go without saying, but sometimes people are dumb: this isn't going to give you the correct person 100% of the time. You'll still need to look through each link to make sure they're who you're looking for.
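
For anyone else fighting the install, here's roughly what works on a stock setup (a sketch, assuming git and Python 3 are already installed; the repo URL is the project's GitHub home):

Bash:
# Grab the project and its Python dependencies
git clone https://github.com/sherlock-project/sherlock.git
cd sherlock
python3 -m pip install -r requirements.txt
# Check a username across the supported sites; results are written to username.txt
python3 sherlock username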
 
I'm running into an error trying to upload a video. The video is in MP4 format and the file size is 150MB.
[screenshot of the upload error]
Can someone tell me what I'm doing wrong? Sorry for my n00bery, y'all!
 
I'm running into an error trying to upload a video. The video is in MP4 format and the file size is 150MB.
Can someone tell me what I'm doing wrong? Sorry for my n00bery, y'all!

100MB is the limit. Either reprocess it at a lower resolution or split it in half and render it as two videos.
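
Both are doable from the command line. A rough sketch of each (the filenames, target height, and quality factor are placeholders; the MP4Box split is the same trick from earlier in the thread):

Bash:
# Option 1: re-encode at a lower resolution (480p here; raise -crf for a smaller file)
ffmpeg -i input.mp4 -vf scale=-2:480 -c:v libx264 -crf 26 -c:a copy smaller.mp4

# Option 2: split on keyframes without re-encoding (-split-size is in KB, so ~100MB pieces)
MP4Box -split-size 100000 input.mp4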
 
Looks like the domain for this workaround has been bought out and it no longer works.
Does anyone have a good way to archive an entire instagram page (aside from a massive screenshot)? I don't need the comments, just want to capture the entire page.

Edit: I took a look at how the Amy Ramadan thread archived her instagram, and noticed that "?hl=en" was on the end of both of those URLs. I added that to the address I'm trying to archive and it looks like it's going thru. I'll update if anything changes, but it looks like I'm just a re.tard, carry on y'all.

Edit2: it didn't work.
 
Looks like the domain for this workaround has been bought out and it no longer works.
Does anyone have a good way to archive an entire instagram page (aside from a massive screenshot)? I don't need the comments, just want to capture the entire page.
Archive.today doesn't work anymore with profile pages, only individual image pages. The best you can do is archive Instagram frontend mirrors like this (https://archive.vn/kIDob), and it won't archive every single image on said profile.
 
archive.md doesn't work anymore with profile pages, only individual image pages. The best you can do is archive Instagram frontend mirrors like this (https://archive.vn/kIDob), and it won't archive every single image on said profile.
I was just reading that post in the FAQ; you just saved me from searching for any more information on there. Thanks again, @BlancoMailo. You da real MVP.

Edit: I used https://gramho.com/; it seemed to do a better job of capturing the images. Figured I'd post that here in case anyone in the future is facing the same issues.
 
While this is bumped, I had another question. The archive.md guy is apparently very unstable.
Are there any contingency plans for important cow content in the event that site ever folds?
I know Null has enough shit going on with targets on his back, but perhaps a Kiwi Farms-owned archiving service to back up thread OP links (text only if needed for space reasons) would be a good future-proofing strategy? I'm aware people outside the farms will claim the content is faked, but what else is new.
 