Archival Tools - How to archive anything.

Been spending a lot of time archiving instagram stories for a few threads and found these three websites to be very helpful:
Story Saver (Pulls up current stories and highlights)
InGramer (Can also be used to pull videos off regular instagram posts)
Picuki

There used to be one that could pull from private Instagrams, but it is not working anymore. There also used to be a trick where you could use "save complete webpage" on Instagram stories to pull the video/image files, but at the moment this only works sometimes for image files.

For YouTube and TikTok:
4K Downloader (I use the paid version, so I don't know the limitations of the free one)
 
4K Downloader (I use the paid version, so I don't know the limitations of the free one)
I have the free version of 4k downloader and my limit is 30 downloads a day. Useful enough.
 
Been spending a lot of time archiving instagram stories for a few threads [...] For YouTube and TikTok: 4K Downloader (I use the paid version, so I don't know the limitations of the free one)
Why not use 4K for Instagram as well? I got it for $10 off during their Christmas sale; it saves a lot of trouble with the auto-saving feature. I personally just use yt-dl for YouTube, though.
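If anyone wants a command-line route for YouTube/TikTok instead, here's a minimal sketch using yt-dlp (the actively maintained youtube-dl fork). The flags are standard yt-dlp options rather than anything the posters above confirmed using, and VIDEO_ID is just a placeholder:

Bash:
# download best video+audio and sort files into per-channel folders
yt-dlp -f "bestvideo+bestaudio/best" \
  -o "%(uploader)s/%(upload_date)s - %(title)s.%(ext)s" \
  "https://www.youtube.com/watch?v=VIDEO_ID"

The same invocation should also work on TikTok links, since yt-dlp has extractors for both sites.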
 
Here's a method for archiving Reddit accounts. I know jack shit about coding but I can do it so you can too. I stole this off of Voat, everyone's favorite alt-right Reddit alternative.

Reddit data is available on BigQuery
https://bigquery.cloud.google.com/table/fh-bigquery:reddit_comments.2015_11

Click on "Compose Query" and paste the following:
SELECT
id
,link_id
,parent_id
,subreddit
,author
,score
,STRFTIME_UTC_USEC(created_utc*1000000,"%Y/%m/%d %H:%M:%S") AS CreatedOnUTC
,"http://www.reddit.com/comments/" + SUBSTR(link_id,4) + "/_/" + id AS URL
FROM
[fh-bigquery:reddit_comments.2007]
,[fh-bigquery:reddit_comments.2008]
,[fh-bigquery:reddit_comments.2009]
,[fh-bigquery:reddit_comments.2010]
,[fh-bigquery:reddit_comments.2011]
,[fh-bigquery:reddit_comments.2012]
,[fh-bigquery:reddit_comments.2013]
,[fh-bigquery:reddit_comments.2014]
,[fh-bigquery:reddit_comments.2015_01]
,[fh-bigquery:reddit_comments.2015_02]
,[fh-bigquery:reddit_comments.2015_03]
,[fh-bigquery:reddit_comments.2015_04]
,[fh-bigquery:reddit_comments.2015_05]
,[fh-bigquery:reddit_comments.2015_06]
,[fh-bigquery:reddit_comments.2015_07]
,[fh-bigquery:reddit_comments.2015_08]
,[fh-bigquery:reddit_comments.2015_09]
,[fh-bigquery:reddit_comments.2015_10]
,[fh-bigquery:reddit_comments.2015_11]
,[fh-bigquery:reddit_comments.2015_12]
,[fh-bigquery:reddit_comments.2016_01]
,[fh-bigquery:reddit_comments.2016_02]
,[fh-bigquery:reddit_comments.2016_03]
,[fh-bigquery:reddit_comments.2016_04]
,[fh-bigquery:reddit_comments.2016_05]
,[fh-bigquery:reddit_comments.2016_06]
,[fh-bigquery:reddit_comments.2016_07]
,[fh-bigquery:reddit_comments.2016_08]
,[fh-bigquery:reddit_comments.2016_09]
,[fh-bigquery:reddit_comments.2016_10]
,[fh-bigquery:reddit_comments.2016_11]
,[fh-bigquery:reddit_comments.2016_12]
,[fh-bigquery:reddit_comments.2017_01]
,[fh-bigquery:reddit_comments.2017_02]
,[fh-bigquery:reddit_comments.2017_03]
,[fh-bigquery:reddit_comments.2017_04]
,[fh-bigquery:reddit_comments.2017_05]
,[fh-bigquery:reddit_comments.2017_06]
,[fh-bigquery:reddit_comments.2017_07]
,[fh-bigquery:reddit_comments.2017_08]
,[fh-bigquery:reddit_comments.2017_09]
,[fh-bigquery:reddit_comments.2017_10]
,[fh-bigquery:reddit_comments.2017_11]
,[fh-bigquery:reddit_comments.2017_12]
,[fh-bigquery:reddit_comments.2018_01]
,[fh-bigquery:reddit_comments.2018_02]
,[fh-bigquery:reddit_comments.2018_03]
,[fh-bigquery:reddit_comments.2018_04]
,[fh-bigquery:reddit_comments.2018_05]
,[fh-bigquery:reddit_comments.2018_06]
,[fh-bigquery:reddit_comments.2018_07]
WHERE author = 'username' ORDER BY CreatedOnUTC
Important:
You will need to add more of those ",[fh-bigquery:reddit_comments.20xx_xx]" lines, depending on the date you do this. Check how far the archive goes and add lines accordingly.
On the last line, change 'username' to the username of the account you want to archive. The username is case-sensitive; do not delete the apostrophes.

When you're done, run the query and wait until it completes. When it finishes, it'll present you with different ways to download the account's history. The easiest method imo is to download the data as an Excel file.


Pros:
  • You don't have to archive a shit ton of pages on archive.md; this method archives thousands of comments at once.
  • Reddit hides posts that are older than 1 year or so on profile pages. This method bypasses that.
  • You can customize the query as you wish, given you know how to use this crap (I don't).
Cons:
  • The last 2-3 months of posts are missing, so you still need to archive the last couple of pages of an account through archive.md. USE THE OLD DOMAIN (old.reddit.com) or the account is archived with the redesign, which looks horrible and is sometimes even completely broken.
  • You need to log in to your Google account to use BigQuery (as it's a Google service), so you cannot access this data anonymously. I don't believe other users can see your activity, but Google certainly can.

Does anyone have an update to this? I have tried but I get a syntax error, as snipped below. I have zero skills, so all I have tried is removing the comma and then the line itself, but it says something similar either way.

(error screenshots attached)
 
Is there a good service to dump a bunch of Patreon videos to? They're currently hosted on yandex.disk which works but it is pretty slow. I understand that Null is open to hosting large files but I don't know if it's worth bothering him if there's a good alternative.
 
Does anyone have an update to this? I have tried but I get a syntax error as snipped below.

In a "SELECT ... FROM ... WHERE ..." the part after "FROM" is the database tables to look in. Try replacing the [ ] with backticks ` ` like it says (leave the commas)
 
In a "SELECT ... FROM ... WHERE ..." the part after "FROM" is the database tables to look in. Try replacing the [ ] with backticks ` ` like it says (leave the commas)
Thank you. That seems obvious now.

I have now changed this, and now I have an error (screenshot attached). Apologies for the hand-holding; I understand now that I need to change the colon, but to what? I tried a period, but that didn't work, so I am back to asking.
@SqXuSR - trying to use this method but as you can see, boomer issues ahoy
 
Try replacing the colon, but escape the existing period as \. since it's probably interfering

If not, then I'm not sure. I was trying to go see the syntax they're using now, but the BigQuery link redirects to a new version of the site and I can't find anything there.
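For what it's worth, here's an untested sketch of how the original query might look in Standard SQL (the dialect the new BigQuery console defaults to). It assumes the fh-bigquery.reddit_comments tables still exist with the same schema, and it uses a wildcard table so you don't have to list every month by hand; swap 'username' for the account you want:

SQL:
SELECT
  id,
  link_id,
  parent_id,
  subreddit,
  author,
  score,
  -- created_utc is assumed to be a UNIX timestamp in seconds, as in the old tables
  FORMAT_TIMESTAMP('%Y/%m/%d %H:%M:%S', TIMESTAMP_SECONDS(created_utc)) AS CreatedOnUTC,
  -- strip the "t3_" prefix from link_id to build the comment permalink
  CONCAT('http://www.reddit.com/comments/', SUBSTR(link_id, 4), '/_/', id) AS URL
FROM
  -- backticks and periods replace the old [project:dataset.table] brackets and colon;
  -- the * wildcard matches every monthly table whose name starts with "20"
  `fh-bigquery.reddit_comments.20*`
WHERE author = 'username'
ORDER BY CreatedOnUTC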
 
btw, if you guys have time, you should try to save some pages at https://archive.org/web/
most are already saved at archive.md but it won't hurt to have backups

btw: selecting "save outgoing links" saves every linked page, including the previous and next two pages

note: "save outgoing links" is only for IA members.

I found out you can just do this from a not-so-publicized API by curling https://web.archive.org/save/$url where $url is the url of the page you want to save.

Example, retrieving only headers. A successful save should say HTTP response code 302 at the top and give you the location of the saved page at the line that says location:
Bash:
curl -I "https://web.archive.org/save/https://kiwifarms.net/threads/the-great-twitter-meltdown-of-2021.93623/page-102"
 
The easiest way to take small (read: non-fullscreen) screenshots in Windows is to click Start > type "snippingtool" into the run box > draw a box around what you want to snip and it'll take a screenshot of it.

Then just click the copy button and paste the image into your message as an attachment. No mucking about with Imgur or uploading it to an external site.
I personally use Lightshot for screenshots on desktop. I used it since like 2014-2015. You just press PrScr and then you select the area, and you can even draw cocks ontop, write "nigger" if just typing it isn't enough and draw arrows as if it was an r/arabfunny screenshot. Sometimes when I'm to lazy to download a Google image, I screenshot it with Lightshot.
 
The main guy doing releases is this Russian guy dstftw who appears to have disappeared. I wonder what happened to him.
 
What's the best way to get a mirror of this blogspot on my laptop that I can browse locally? Is wget still the way to go?

Code:
wget -mkEpnp https://pleasantfamilyshopping.blogspot.com/

Code:
--mirror – Makes (among other things) the download recursive.
--convert-links – convert all the links (also to stuff like CSS stylesheets) to relative, so it will be suitable for offline viewing.
--adjust-extension – Adds suitable extensions to filenames (html or css) depending on their content-type.
--page-requisites – Download things like CSS style-sheets and images required to properly display the page offline.
--no-parent – When recursing, do not ascend to the parent directory. Useful for restricting the download to only a portion of the site.

yoinked


Testing it, 250MB so far

Edit:

Code:
┌─[0]─[ec2-user@althalus]─[~/web/pleasantfamilyshopping.blogspot.com]
└── $ ls 2009/01
before-they-drove-old-dixie-down.html
before-they-drove-old-dixie-down.html?showComment=1231568580000.html
before-they-drove-old-dixie-down.html?showComment=1231597740000.html
before-they-drove-old-dixie-down.html?showComment=1231628700000.html
before-they-drove-old-dixie-down.html?showComment=1231647240000.html
before-they-drove-old-dixie-down.html?showComment=1231648800000.html
before-they-drove-old-dixie-down.html?showComment=1231705080000.html
before-they-drove-old-dixie-down.html?showComment=1231762860000.html
before-they-drove-old-dixie-down.html?showComment=1231797780000.html
before-they-drove-old-dixie-down.html?showComment=1231816200000.html
before-they-drove-old-dixie-down.html?showComment=1231860600000.html
before-they-drove-old-dixie-down.html?showComment=1231874640000.html
before-they-drove-old-dixie-down.html?showComment=1231896720000.html
before-they-drove-old-dixie-down.html?showComment=1231956060000.html
before-they-drove-old-dixie-down.html?showComment=1231957860000.html
before-they-drove-old-dixie-down.html?showComment=1231965420000.html
before-they-drove-old-dixie-down.html?showComment=1232220900000.html
before-they-drove-old-dixie-down.html?showComment=1232250240000.html
before-they-drove-old-dixie-down.html?showComment=1232473200000.html
before-they-drove-old-dixie-down.html?showComment=1232518380000.html
before-they-drove-old-dixie-down.html?showComment=1236868080000.html
before-they-drove-old-dixie-down.html?showComment=1251435228935.html
before-they-drove-old-dixie-down.html?showComment=1251931958910.html
before-they-drove-old-dixie-down.html?showComment=1257957470234.html
before-they-drove-old-dixie-down.html?showComment=1258261854675.html
before-they-drove-old-dixie-down.html?showComment=1259803348458.html
before-they-drove-old-dixie-down.html?showComment=1267881282783.html
before-they-drove-old-dixie-down.html?showComment=1269228912362.html
before-they-drove-old-dixie-down.html?showComment=1285339903068.html
before-they-drove-old-dixie-down.html?showComment=1289840940876.html
before-they-drove-old-dixie-down.html?showComment=1311353847289.html
before-they-drove-old-dixie-down.html?showComment=1311690238264.html
before-they-drove-old-dixie-down.html?showComment=1314047838456.html
before-they-drove-old-dixie-down.html?showComment=1320947353308.html
before-they-drove-old-dixie-down.html?showComment=1320948768575.html
before-they-drove-old-dixie-down.html?showComment=1327209217291.html
before-they-drove-old-dixie-down.html?showComment=1327335790145.html
before-they-drove-old-dixie-down.html?showComment=1329949431320.html
before-they-drove-old-dixie-down.html?showComment=1348232834403.html
before-they-drove-old-dixie-down.html?showComment=1355109991391.html
before-they-drove-old-dixie-down.html?showComment=1358267814600.html
before-they-drove-old-dixie-down.html?showComment=1416711246397.html
before-they-drove-old-dixie-down.html?showComment=1425612209033.html
before-they-drove-old-dixie-down.html?showComment=1425612414173.html
family-affair-at-kroger.html
happy-new-year.html
index.html
very-fashionable-kroger-1966_25.html

uh ok just a minute lol

Code:
wget -mkEHpnp -R "*?showComment*" -D "pleasantfamilyshopping.blogspot.com,1.bp.blogspot.com,2.bp.blogspot.com,3.bp.blogspot.com,4.bp.blogspot.com" https://pleasantfamilyshopping.blogspot.com/

Adds:
-H traverse hosts
-R reject
-D domains to follow

You can monitor for yourself while it runs:

Code:
watch -n1 "du -hs /path/to/directory"

It seems to be doing what it's supposed to, except it's taking a while because, as the linked blogger notes, it downloads the unwanted pages and then throws them away.

Yeah that last one is good, final count 417MB with everything, 86MB for just the stuff on the pleasantfamilyshopping domain
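If you'd rather not download the ?showComment pages at all, newer wget builds also have --reject-regex, which filters matching URLs before they're fetched instead of deleting them afterwards the way -R does. Untested on this particular blog, but something like this should work:

Code:
wget -mkEHpnp --reject-regex '\?showComment' -D "pleasantfamilyshopping.blogspot.com,1.bp.blogspot.com,2.bp.blogspot.com,3.bp.blogspot.com,4.bp.blogspot.com" https://pleasantfamilyshopping.blogspot.com/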

Paste:

Code:
printf '\n\nblogget() { \nwget -mkEHpnp -R "*?showComment*" -D "$1,1.bp.blogspot.com,2.bp.blogspot.com,3.bp.blogspot.com,4.bp.blogspot.com" $1 \n}\n' >> ~/.bashrc ; source ~/.bashrc

Result:

Code:
blogget() {
wget -mkEHpnp -R "*?showComment*" -D "$1,1.bp.blogspot.com,2.bp.blogspot.com,3.bp.blogspot.com,4.bp.blogspot.com" $1
}

Use:

Code:
blogget https://pleasantfamilyshopping.blogspot.com/
 