youtube-dl DMCA'd by the RIAA - RIAA and MPAA are on a mass takedown spree

 
I did a thing.

To continue the Streisand effect going on with youtube-dl, I set up a mirror of my own. The VPS it's hosted on is paid for 3 months and should be good for a while. Feel free to add it to the list of mirrors in the OP, @Sage In All Fields.


Susan Wojcicki has got YouTube to break 2020.09.20 though. The developers need to work on it, and they can't do that because the RIAA has killed their GitHub repo.

Someone cleverly used an unfixed bug in GitHub to attach the youtube-dl source code to the DMCA request:

https://twitter.com/lrvick/status/1320246266270519297
1604155171832.png


https://github.com/github/dmca/tree/416da574ec0df3388f652e44f7fe71b1e3a4701f
https://archive.vn/T9k6f

So the RIAA might have to DMCA the repo containing their own DMCA request

1604155270504.png
 
E.g. this video, which ironically contains YouTube CEO Susan Wojcicki being killed, now can't be downloaded.
Also, if anyone thinks this isn't a big deal because it will only affect youtube-dl: most YouTube downloader sites, like streamable and clipconverter, depend on youtube-dl to get videos, and they're already being affected by this whole shitty situation.

js.PNG


cc.PNG


Seems like YouTube has added some weird extra DRM to videos with copyrighted music to cover their ass from the RIAA. I get the feeling none of the downloader sites are going to try and bypass it, because then the RIAA will go after them.
 
If you look at the error in 2020.09.20 I'm pretty sure it happens here. This is from

youtube-dl/youtube_dl/extractor/youtube.py

Python:
# Excerpt, dedented for readability; it is not runnable on its own.
if cipher:
    if 's' in url_data or self._downloader.params.get('youtube_include_dash_manifest', True):
        ASSETS_RE = r'"assets":.+?"js":\s*("[^"]+")'
        # First attempt: default=None, so a miss just returns None.
        jsplayer_url_json = self._search_regex(
            ASSETS_RE,
            embed_webpage if age_gate else video_webpage,
            'JS player URL (1)', default=None)
        if not jsplayer_url_json and not age_gate:
            # We need the embed website after all
            if embed_webpage is None:
                embed_url = proto + '://www.youtube.com/embed/%s' % video_id
                embed_webpage = self._download_webpage(
                    embed_url, video_id, 'Downloading embed webpage')
            # Second attempt: no default, so a miss raises an error.
            jsplayer_url_json = self._search_regex(
                ASSETS_RE, embed_webpage, 'JS player URL')

Specifically, on the last line.

If you go here and monitor network requests in Chrome:

https://www.youtube.com/watch?v=feky-ahvmOk

you can see where it pulls video from. I've changed my IP to xx.xx.xx.xx and flipped a few random chars in the parameters for privacy reasons.

Code:
https://r1---sn-aigzrn7d.googlevideo.com/videoplayback?expire=1604183566&ei=rpGdX4YZNcfSxgKQgbDQDw&ip=xx.xx.xx.xx&id=o-ABaXMBBW5hilVS9DQWumjQt1J2RC6J1sUikzEf97Rgrm&itag=251&source=youtube&requiressl=yes&mh=ip&mm=31%2C26&mn=sn-aigzrn7d%2Csn-5hnf6ns6&ms=au%2Conr&mv=m&mvi=1&pl=24&initcwndbps=1248750&vprv=1&mime=audio%2Fwebm&gir=yes&clen=16004935&dur=1032.161&lmt=1604037124811303&mt=1604161841&fvip=1&keepalive=yes&c=WEB&txp=6311222&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cgir%2Cclen%2Cdur%2Clmt&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIhAL_KQUivCpd16OnfbzvBpAD9PBTbJd9SQx_njH5xLBSNAiAIt5_SAdACLCmCZj2KIrzmlzB5Nk8w11UIOU8a2CvnfQ%3D%3D&alr=yes&sig=AOq0QJ8wRQIhAPvZwOID2IOEbdSyMHRWN4BHagHPiSKNtJPiHFY4bybpAiBH2QNZtreCdbUccaaBzCkexDDZJ6hfI16AUk3mZTWxWw%3D%3D&cpn=rvcqTzOwmriT_vpS&cver=2.20201029.02.00&range=2096280-2608079&rn=18&rbuf=64254

Basically, it seems like Google's scheme is that they've got JavaScript code which works out the above URL with some magic 'key' parameters. Fixing youtube-dl would require reverse engineering how this URL is generated.
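To make the URL above a bit easier to read, here is a minimal sketch that splits a shortened copy of it into its query parameters using only the standard library. The shortened URL and the PLACEHOLDER signature are my own stand-ins, not real values, and my reading of sparams as the list of parameter names covered by the signature is an assumption based on the name.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the URL above: same parameter names, most values dropped,
# and the signature replaced by a placeholder (assumption for illustration).
url = (
    "https://r1---sn-aigzrn7d.googlevideo.com/videoplayback"
    "?expire=1604183566&ip=xx.xx.xx.xx&itag=251&mime=audio%2Fwebm"
    "&clen=16004935&dur=1032.161"
    "&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl"
    "&sig=PLACEHOLDER&range=2096280-2608079"
)

parts = urlsplit(url)
# parse_qs returns a list per key and percent-decodes the values;
# keep the first value of each.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

# Presumably the names of the parameters the signature covers.
signed_names = params["sparams"].split(",")

# The player fetches the file in byte ranges rather than in one request.
range_start, range_end = (int(n) for n in params["range"].split("-"))

print(parts.hostname)
print(params["mime"], "itag", params["itag"])
print(signed_names)
print(range_end - range_start + 1, "bytes requested")
```

Everything except sig is plain metadata; the signature is the part a downloader has to be able to reproduce.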

I think the RIAA has pushed Google to implement this scheme to break youtube-dl and then DMCA'd youtube-dl's repo as soon as it went live to stop people reverse-engineering it and fixing youtube-dl.

unzip NyanYT.png -d NyanYTExt
 

Because companies always uphold contractual obligations in the face of a legal threat or hate mob.

The host I went with tends to be more defensive of activities like this instead of giving in to pressure. I'm not as concerned as long as I have contact with the owner of the host, which I do.
 
If you look at the error in 2020.09.20 I'm pretty sure it happens here. This is from

youtube-dl/youtube_dl/extractor/youtube.py

The yt-dlc fork changes the line as follows (first the original, then the yt-dlc version). Perhaps it will fix the problem you are experiencing, since the pull request is mentioned in many issues from people having the same problem:

Python:
                        ASSETS_RE = r'"assets":.+?"js":\s*("[^"]+")'

Python:
                        ASSETS_RE = r'(?:"assets":.+?"js":\s*("[^"]+"))|(?:"jsUrl":\s*("[^"]+"))'
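To see what the change buys you, here is a small sketch that runs both patterns against two page fragments using plain re. The fragments are made up for illustration and are not real YouTube output, and find_player_url is a hypothetical helper, not a youtube-dl function. Because the new pattern is an alternation with two capture groups, the helper returns whichever group actually matched.

```python
import re

OLD_ASSETS_RE = r'"assets":.+?"js":\s*("[^"]+")'
NEW_ASSETS_RE = r'(?:"assets":.+?"js":\s*("[^"]+"))|(?:"jsUrl":\s*("[^"]+"))'


def find_player_url(pattern, page):
    """Return the first capture group that matched, or None if nothing did."""
    match = re.search(pattern, page)
    if match is None:
        return None
    return next((g for g in match.groups() if g is not None), None)


# Hypothetical page fragments (assumptions for illustration).
old_style_page = '{"assets": {"css": "/s/player/www.css", "js": "/s/player/base.js"}}'
new_style_page = '{"jsUrl": "/s/player/base.js"}'

for name, page in (("old page", old_style_page), ("new page", new_style_page)):
    print(name, "| old regex:", find_player_url(OLD_ASSETS_RE, page),
          "| new regex:", find_player_url(NEW_ASSETS_RE, page))
```

The old pattern finds nothing on a page that only carries a jsUrl key, while the new one handles both layouts. Note that the captured value still includes the surrounding double quotes.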

I attached my youtube.py file for the lazy. It may or may not have other hotfixes in it; I tend to add them when releases take too long, but don't usually keep track of whether I've done so or not.
 


Interesting. If I run youtube-dlc I get

Code:
~/yt-dlc
$ ./youtube-dlc -F https://www.youtube.com/watch?v=feky-ahvmOk
[youtube] feky-ahvmOk: Downloading webpage
[info] Available formats for feky-ahvmOk:
format code  extension  resolution note
249          webm       audio only tiny   67k , opus @ 50k (48000Hz), 6.24MiB
250          webm       audio only tiny   86k , opus @ 70k (48000Hz), 7.95MiB
140          m4a        audio only tiny  131k , m4a_dash container, mp4a.40.2@128k (44100Hz), 15.93MiB
251          webm       audio only tiny  160k , opus @160k (48000Hz), 15.26MiB
278          webm       256x144    144p  133k , webm container, vp9, 30fps, video only, 8.17MiB
160          mp4        256x144    144p  157k , avc1.4d400c, 30fps, video only, 8.50MiB
242          webm       426x240    240p  282k , vp9, 30fps, video only, 13.24MiB
133          mp4        426x240    240p  349k , avc1.4d4015, 30fps, video only, 18.23MiB
243          webm       640x360    360p  529k , vp9, 30fps, video only, 22.59MiB
134          mp4        640x360    360p  711k , avc1.4d401e, 30fps, video only, 34.36MiB
244          webm       854x480    480p  914k , vp9, 30fps, video only, 36.23MiB
135          mp4        854x480    480p 1202k , avc1.4d401f, 30fps, video only, 60.01MiB
247          webm       1280x720   720p 1968k , vp9, 30fps, video only, 67.56MiB
136          mp4        1280x720   720p 2253k , avc1.64001f, 30fps, video only, 110.72MiB
302          webm       1280x720   720p60 3274k , vp9, 60fps, video only, 93.82MiB
298          mp4        1280x720   720p60 3804k , avc1.640020, 60fps, video only, 152.37MiB
303          webm       1920x1080  1080p60 4776k , vp9, 60fps, video only, 142.73MiB
299          mp4        1920x1080  1080p60 6322k , avc1.64002a, 60fps, video only, 269.96MiB
18           mp4        640x360    360p  379k , avc1.42001E, 30fps, mp4a.40.2@ 96k (44100Hz), 46.73MiB
22           mp4        1280x720   720p 1028k , avc1.64001F, 30fps, mp4a.40.2@192k (44100Hz) (best)

This was just building it from source:

git clone https://github.com/blackjack4494/yt-dlc
cd yt-dlc/youtube-dlc
make

Then I did a

make install

And it copied it to /usr/local/bin on my Mac, just like it was a proper Unix machine rather than an abomination of soy and authoritarianism.
 

If you do end up staying on the yt-dlc fork: it was started several weeks ago because youtube-dl was taking too long to incorporate features and pull requests. One recent addition for people who enjoy archiving is that it now supports downloading the YouTube live chat as a subtitle file. As time goes on it may end up becoming the dominant choice.
 
You might want to use some flags too. This is my favorite, for just automatically downloading at highest quality. I just alias it.

alias you-dl="youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4'"
You can actually put a default youtube-dl config file in your user directory to automatically include certain options instead of messing with aliasing. On Windows that means making a text file at
C:\Users\YOURNAMEHERE\youtube-dl.conf
Mine has the contents
Code:
--embed-subs --all-subs
so that I always grab available subs.
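As a rough picture of why a config file like that works: its contents are just command-line options, so they can be split into an argument list the same way a shell would. This is a sketch under that assumption, not youtube-dl's actual loader, and read_options plus the comment line in the sample are made up for illustration.

```python
import shlex


def read_options(config_text):
    """Split config file text into a flat list of command-line arguments."""
    args = []
    for line in config_text.splitlines():
        # comments=True drops anything after an unquoted '#'.
        args.extend(shlex.split(line, comments=True))
    return args


# Same options as the file above, plus a hypothetical comment line.
config_text = "# always grab subtitles\n--embed-subs --all-subs\n"

print(read_options(config_text))
```

The resulting list can be treated as if the options had been typed before the URL on every run.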
Youtube-dl also automatically does best quality by default, btw. You just need to ensure you have ffmpeg installed and on your PATH.

MKV is just a container format that lets you mix and match weird random codecs, so it doesn't work in a lot of places. youtube-dl doesn't always produce MKV; it just does it as a last resort.
Part of the reason that it uses MKV is that YouTube's "best video" and "best audio" are in different formats, so in order to actually get the best of both you need a container format like MKV. You can force transcoding to other formats, but you'll lose quality in audio, video, or both.
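Here is a simplified sketch of that MKV fallback, using four of the formats from the -F listing earlier in the thread. Picking "best" purely by bitrate and the exact grouping of compatible extensions are assumptions for illustration; youtube-dl's real selection logic looks at more than this.

```python
# Extension groups that can share one container without falling back to MKV.
# The grouping is an assumption for illustration.
COMPATIBLE_EXTS = (
    {"mp4", "m4a", "m4v"},
    {"webm"},
)


def pick_container(video_ext, audio_ext):
    """Keep the video's container if the audio fits in it, else use MKV."""
    for group in COMPATIBLE_EXTS:
        if video_ext in group and audio_ext in group:
            return video_ext
    return "mkv"


# (format code, extension, kind, bitrate in kbit/s), taken from the listing.
formats = [
    (140, "m4a", "audio", 131),
    (251, "webm", "audio", 160),
    (303, "webm", "video", 4776),
    (299, "mp4", "video", 6322),
]

best_video = max((f for f in formats if f[2] == "video"), key=lambda f: f[3])
best_audio = max((f for f in formats if f[2] == "audio"), key=lambda f: f[3])
container = pick_container(best_video[1], best_audio[1])

print("video", best_video[0], "+ audio", best_audio[0], "->", container)
```

For this video the highest-bitrate video is the mp4 stream and the highest-bitrate audio is the webm/opus stream, so the pair only fits together in MKV. Restricting both sides to mp4/m4a, as the alias above does, avoids MKV at the cost of not always getting the top audio stream.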
 
Changelog for the latest release is below. I couldn't find the reason for the ".1" re-release as 2020.11.01.1, or anything special in the tarball other than the version bump, so it was likely a packaging error that caused them to re-release. Maybe somebody in their IRC channel knows.

Markdown (GitHub flavored):
version 2020.11.01

Core
* [utils] Don't attempt to coerce JS strings to numbers in js_to_json (#26851)
* [downloader/http] Properly handle missing message in SSLError (#26646)
* [downloader/http] Fix access to not yet opened stream in retry

Extractors
* [youtube] Fix JS player URL extraction
* [ytsearch] Fix extraction (#26920)
* [afreecatv] Fix typo (#26970)
* [23video] Relax URL regular expression (#26870)
+ [ustream] Add support for video.ibm.com (#26894)
* [iqiyi] Fix typo (#26884)
+ [expressen] Add support for di.se (#26670)
* [iprima] Improve video id extraction (#26507, #26494)

They chose a different regex modification to fix the JS player URL extraction problem that everybody is having. The "assets" pattern keeps the old form and was not changed the way it is in yt-dlc; somebody else who loves regexes can figure out what the implications are, if there are any at all. They did add a <script ... src=...> pattern (the \b in \bsrc is just a word boundary, not part of the name) and a jsUrl pattern.

Python:
                        ASSETS_RE = (
                            r'<script[^>]+\bsrc=("[^"]+")[^>]+\bname=["\']player_ias/base',
                            r'"jsUrl"\s*:\s*("[^"]+")',
                            r'"assets":.+?"js":\s*("[^"]+")')
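A tuple of patterns like this gets tried in order until one matches. Here is a standalone sketch of that behaviour with plain re; search_first is a hypothetical helper, not the youtube-dl method, and the three page fragments are made up for illustration.

```python
import re

ASSETS_RE = (
    r'<script[^>]+\bsrc=("[^"]+")[^>]+\bname=["\']player_ias/base',
    r'"jsUrl"\s*:\s*("[^"]+")',
    r'"assets":.+?"js":\s*("[^"]+")',
)


def search_first(patterns, text):
    """Try each pattern in order; return (index, captured text) or None."""
    for index, pattern in enumerate(patterns):
        match = re.search(pattern, text)
        if match:
            return index, match.group(1)
    return None


# Hypothetical page fragments (assumptions for illustration).
script_tag = '<script src="/s/player/abc/base.js" name="player_ias/base"></script>'
js_url = '{"jsUrl" : "/s/player/abc/base.js"}'
assets = '{"assets": {"js": "/s/player/abc/base.js"}}'

for fragment in (script_tag, js_url, assets):
    print(search_first(ASSETS_RE, fragment))
```

One visible difference from the yt-dlc version: this jsUrl pattern allows whitespace before the colon, so the second fragment still matches. Keeping the old "assets" pattern as the last entry means pages in the old layout keep working.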
 
Really surprised at the fact that the RIAA somehow still survives. Where exactly are these parasites leeching funds from? iTunes?
 
After collecting my thoughts I'm going to sperg about this new paradigm I'm seeing Google and the RIAA force into the world, and how it might already be too late to stop it.

copyright_bullshit.png

Basically, it seems like Google's scheme is that they've got Javascript code which works out the above URL with some magic 'key' parameters.

Essentially, Google and the RIAA have established this change to the YouTube codebase as a litmus test between the "good faith" archivers and the EVIL BABY MURDERING PIRATES. If a video is flagged by YouTube's automated system as having copyrighted music/audio, the video is also given this extra layer of encryption and becomes harder to work around. Even if it's just 2 seconds of a copyrighted song in an hour-long video. Even if the video doesn't have any copyrighted music at all, because it all depends on an automated system that everyone has known for YEARS is practically non-functioning and possibly even malicious.

"So what?" some of you may be asking. "Someone will just make a program that works around it." Except that's just the problem: the extra encryption is a legal trap. If the RIAA finds you working around it, they can easily use that to put legal pressure on you. "Why else would you be working around this system specifically designed to prevent making copies of copyrighted music? It's obvious you're encouraging piracy and this software needs to be shut down."

And why stop at just audio? YouTube can attach this encryption to everything that gets flagged by Content ID, and suddenly a good majority of YouTube's library becomes legally questionable to download even if a video doesn't violate any copyright. Meanwhile, sites like streamable and clipconverter have already fallen in line with this policy and most likely won't do shit to fight against it out of fear of being sued up the ass. Even if youtube-dl manages to convince the courts that their software as is does not violate copyright, I don't think the courts would be as easily convinced that bypassing encryption meant specifically to prevent piracy of copyrighted works is also legal. Oh, and let's not forget this encryption will ABSOLUTELY be used in bad faith to prevent users from archiving certain things that YouTube has an interest in keeping forgotten.
 
Really surprised at the fact that the RIAA somehow still survives. Where exactly are these parasites leeching funds from? iTunes?
Literally every conventionally released piece of music in the entire United States as well as EVERY FUCKING SINGLE PIECE OF BLANK AUDIO MEDIA sold.
 
After collecting my thoughts I'm going to sperg about this new paradigm I'm seeing Google and the RIAA force into the world, and how it might already be too late to stop it.
I know this answer sucks but the proper solution is to start shifting away from YouTube.
 
After collecting my thoughts I'm going to sperg about this new paradigm I'm seeing Google and the RIAA force into the world, and how it might already be too late to stop it.
Yes, this basically captures it.

I can upload copies of ytdl to multiupload sites and post the links on Twitter and imageboards all day, it won't do anything.

If, at this point, this bullshit argument is not directly challenged by an organization with deep pockets like the EFF, it will take hold in US law. At which point developers can be directly attacked. Any US-based organization hosting a copy of ytdl, like the Debian project, can be attacked. Sites using ytdl can be attacked. Scum like Google can add these nonsense 'encryptions' to every video they host.

I am not optimistic. These internet freedom organizations are funded by scum like Google in the first place. They're unlikely to stand up against this nonsense. Nor is Mr. Friedman of GitHub likely to stand up to the RIAA, despite the fact that Microsoft would earn tens of millions of dollars equivalent in good publicity, because freedom is bad for Microsoft's business.
 
I am not optimistic. These internet freedom organizations are funded by scum like Google in the first place. They're unlikely to stand up against this nonsense.
Google may be scum, but their basic business model with YouTube is threatened by arguments like the one the RIAA is making. That's why they ended up being sued by Viacom, a case they arguably won, although the settlement requires them to do bullshit like this. So they'll break things like youtube-dl from time to time, but it really isn't in their interest to do more than the bare minimum to look like they're trying to stop stuff like this.
 
Why don't you just host it in Ayatollah land?
 