🐱 Inside the lawsuit that could upend the Internet Archive as you know it

CatParty

In the early days of the pandemic, as physical libraries, schools, universities, and bookstores closed—and people were restricted from leaving their homes with very few exceptions—long waiting lists developed to access popular ebooks at public libraries.

To alleviate that problem, the Internet Archive launched a short-term project. Dubbed the National Emergency Library, it allowed anyone who signed up—for free—to their website to borrow digital copies of 1.4 million books in their possession without a waiting list. Most of these materials were 20th-century books that the Internet Archive had previously digitized to make up for the lack of commercially available ebook versions.

The National Emergency Library was part of the Open Libraries initiative—a web-accessible public library containing the full texts of over 1.6 million public domain books as well as over 647,000 books not in the public domain.

In the Internet Archive’s announcement, published on March 24, 2020, digital librarian Brewster Kahle said that allowing anyone to borrow these 1.4 million books without a waiting list in a time of crisis “was our dream for the original Internet coming to life: the Library at everyone’s fingertips”.

Two years later, though, the Internet Archive’s dream is playing out as a legal nightmare.

Following public criticism from several writers, and accusations of “acting as a piracy site” by the Authors Guild, a group of major publishing houses sued the Internet Archive in summer 2020.

Hachette, Penguin Random House, John Wiley, and HarperCollins claimed that the Internet Archive’s Open Libraries initiative acted as an “unlicensed aggregator and pirate site” that operated to the detriment of authors and publishers.

Now, both the Internet Archive and the publishers are hoping to settle the matter without needing a full trial: both parties requested a pre-motion conference on a motion for summary judgment—meaning a federal judge will rule on the suit instead of a jury.

As both the Internet Archive and the publishers prepare for the next steps in the legal proceedings, the heavily publicized lawsuit is renewing worries about the significance and the future of the archive.

The matter seems far from resolved from both a legal and an ethical point of view: the applicability of copyright law in an age of infinite digital replicability is itself an enormous and ongoing point of contention that can’t be solved by a single lawsuit.

In this suit, the Internet Archive—represented by the Electronic Frontier Foundation (EFF)—argued that its Open Libraries initiative is basically equivalent to traditional library lending thanks to what is known as Controlled Digital Lending. According to this argument, the Internet Archive has been making digital copies of books that it physically owns, but only lending out the digital file to one user at a time, essentially replicating the experience of physical libraries only loaning a book to one person.

“The Internet Archive and the hundreds of libraries and archives that support it are not pirates or thieves,” EFF Legal Director Corynne McSherry recently stated. “They are librarians, striving to serve their patrons online just as they have done for centuries in the brick-and-mortar world. Copyright law does not stand in the way of a library’s right to lend its books to its patrons, one at a time.”

The National Emergency Library seemingly circumvented the Controlled Digital Lending system for a few months, allowing students, academics, and everyone else to borrow up to five digitized books or ebooks for a two-week period. As opposed to how the Controlled Digital Lending system usually works, during this emergency period, the Internet Archive allowed people to access the same digital copy of a text at the same time.

Some authors and publishers, regardless of the benevolent act at a time of a national emergency, believe that the Controlled Digital Lending theory fundamentally misinterpreted copyright law. As the Authors Guild stated, “There is simply no basis in the law for scanning and making copies of entire books available to the public.”

This dispute has played out for years: In 2019, the British Society of Authors threatened to sue the Internet Archive unless it stopped the alleged unauthorized lending of digitized books. The launching of the National Emergency Library at a time when most libraries were closed frustrated many authors, who worried their income could be further endangered at a time of particular precarity.

The Internet Archive maintains that its work does not actually harm writers or publishers.

“In a copyright lawsuit against a practice that has continued for years, one would expect the copyright holder to be able to point to some metric showing that the defendant’s conduct has harmed them,” the recent motion seeking summary judgment, presented in early July, reads. “Plaintiffs have failed to quantify any market harm from CDL. And there’s a good reason: because the lending, licensing, and sales data demonstrate that no such harm has occurred or is likely to occur.”

Still, the plaintiffs are asking for the Internet Archive to pay financial damages for 127 copyrighted titles present in the Open Libraries: according to an estimate by Vox, if the publishers win they could receive up to $19 million in damages, equivalent to one year of the Archive’s operating revenue.

Although the lawsuit clarifies that the publishers don’t want the rest of the Internet Archive to close, it asks for a preliminary and permanent injunction against the Internet Archive’s digitization and lending processes. The publishers claim that because the Internet Archive offers more than 33,000 of their copyrighted works for free download, the digital library is unfairly competing with their authorized ebooks. A permanent injunction would leave the digital library depleted and risk future efforts by the Internet Archive.

According to the EFF’s McSherry, the stakes far surpass the Internet Archive. “The publishers are not seeking protection from harm to their existing rights. They are seeking a new right foreign to American copyright law: the right to control how libraries may lend the books they own,” she stated.

“Beyond the monetary damages, the publishers are asking for the destruction of 1.4 million books, many of which do not exist in digital form anywhere else. That would be a real tragedy for people who depend on us for access to information,” Internet Archive founder Brewster Kahle told Vox in 2020.

Neither the Internet Archive nor the plaintiffs’ attorneys replied to The Daily Dot’s request for comment.

I am ride or die for the internet archive. It’s like the only good website left and it must be guarded by its bravest posters.
— chris person (@Papapishu) July 12, 2022

Now, as the summary judgment process plays out, many users are showing their support for the Internet Archive, highlighting the revelatory work the non-profit organization has been doing since it started in 1996.

Its biggest feat by far is the maintenance of the Wayback Machine, an invaluable collection of over 390 billion web pages that represents the most thorough archive of internet history and culture in the world.

Most of the posts stress the Internet Archive’s importance in their pursuit of knowledge and information.

“The data loss if the Internet Archive went down would be comparable to the Library of Alexandria,” one user said.

“I have access to a major university library, and all its databases. The Internet Archive library is a better tool than most discovery systems. It can search *inside* ALL its books. This helps me find useful sources at my libraries, at other libraries, and books to PURCHASE,” another wrote.

The Archive’s importance for academic research was also stressed in a tweet stating that: “Without Internet Archive, much of my research would not be possible. Most of what I access are out-of-print books from the nineteenth century.”

Considering that the print copies of these books are usually incredibly hard to access, let alone borrow, having a free, digitized copy at a click’s distance can speed up the research process significantly—and make knowledge more attainable for people who don’t work in academia.

The Mosul Eye, a news blog that documented the Islamic State’s occupation of the Iraqi city of Mosul, said that “In many cities like Mosul, where our libraries have been destroyed by terrorism, websites like the Internet Archive have become our only window to knowledge”.

Not everyone can acess the average library and absolutely not everyone has the means to buy books.
This is why the internet archive is important! It's a library for anyone in the world as long as they have an internet connection! It is as I said a humanitarian service!
— Signe (@Kaijumara) July 9, 2022

Some are also underlining how the whole lawsuit can be interpreted as a dispute over the very nature of libraries, since the plaintiffs are essentially going after the Controlled Digital Lending system as an alternative to the more conventional approach to buying ebooks that brick-and-mortar libraries mostly employ.

“This isn’t authors vs piracy. It’s about the possibility of forging a pathway, in our existing, awful, law, for libraries to function like they’re supposed to. Every copy of every book a library owns could be checked out as an ebook. Want a rare book? If any library has a copy, they can loan your library theirs instantly through interlibrary loan. The Internet Archive has already digitized it, and all the other library has to do is set their copy aside,” @cozyunoist wrote.

With the parties’ request to proceed with summary judgment being granted, a ruling is expected sometime toward the end of 2022 or the beginning of 2023.
 
Why does the INTERNET archive even need the fake library thing? Nuke it and you avoid the lawsuit altogether. Have someone else fight this retard. It's just not worth sacrificing the already delicate web archive and extensive archives of software, microfilm, newspapers, etc. It's already bad enough the site needs a section for people to request removal of archived websites and the fact that robots.txt can be used by webmasters to block entire sites from being saved by it at all.
 
The best part of the Internet Archive is the Wayback Machine. As long as that remains intact, I couldn't care less about the rest of it. The Wayback Machine is priceless. I have been able to dig files out of it that haven't been available in 25 years.

However, if the publishwhores manage to win this one, it will basically ensure that I will absolutely never buy another book, if I have any other choice, ever again. They really need to think about the negative value of the hate this will generate over the very small potential profits that they may have lost.
That and the numerous archives of old software, old manuals and user guides that don’t really exist anywhere else. Not as useful as the wayback machine, but an important part of preservation in general. It’s a generally reliable place to host scans of rare documents that should be preserved. Some of their shit regarding takedowns I’m not a fan of, but given how quickly people go after them for shit like copyright infringement I understand it.
 
It's not going to be the end of the world if it gets shut down, I pirated before it and I'll pirate after it.
If they take down the site, including the wayback machine, it would be the modern equivalent of the burning of the library of alexandria. Decades of web history wiped out in the name of profit. Hopefully that won't happen.

We need an Internet Archive Archive.

I agree. I think it's fucking stupid that the IA is located in San Fran, an area with a history of natural disasters. One bad earthquake could destroy all their data. Put it in Greenland, the mountain side of Colorado, or Iowa. Somewhere where the elements won't fuck with it.

Unless Elon Musk or someone with very deep pockets decides to step in, nearly impossible to manually backup. Considering they have fucking Petabytes of data. Not something the average Data Hoarder has on hand.
 
In this suit, the Internet Archive—represented by the Electronic Frontier Foundation (EFF)—argued that its Open Libraries initiative is basically equivalent to traditional library lending thanks to what is known as Controlled Digital Lending. According to this argument, the Internet Archive has been making digital copies of books that it physically owns, but only lending out the digital file to one user at a time, essentially replicating the experience of physical libraries only loaning a book to one person.
To my understanding, this digital book lending concept isn't protected by any existing copyright law, so their argument that they only gave out digital books according to the physical copies they had available holds no water. They fucked up, didn't bother to get permission from anybody and they're going to eat damages. Sorry, them's the breaks.
 
random accounts on Internet Archive get popped all the fucking time, if publisher was really that butthurt they could just report it
I've had posts of my own get nuked off there, content that was essentially abandoned by the studios that created it. As in, no way to actually access it legitimately. But they still had it in their content identification systems and those still automatically flagged the content. It's easy as hell for people to get stuff removed from IA, they don't even have to prove that they own it, same as Youtube.
 
Why does the INTERNET archive even need the fake library thing? Nuke it and you avoid the lawsuit altogether. Have someone else fight this retard. It's just not worth sacrificing the already delicate web archive and extensive archives of software, microfilm, newspapers, etc. It's already bad enough the site needs a section for people to request removal of archived websites and the fact that robots.txt can be used by webmasters to block entire sites from being saved by it at all.
They did nuke it, and the publishers refused to let up.
I've had posts of my own get nuked off there, content that was essentially abandoned by the studios that created it. As in, no way to actually access it legitimately. But they still had it in their content identification systems and those still automatically flagged the content. It's easy as hell for people to get stuff removed from IA, they don't even have to prove that they own it, same as Youtube.
This is true and one of the weakest parts of IA but I doubt they have the resources like Google to actually have an automated system to verify copyright and process takedown requests at near the same level.
 
This is true and one of the weakest parts of IA but I doubt they have the resources like Google to actually have an automated system to verify copyright and process takedown requests at near the same level.
I don't think THEY do, I believe that at this point several publishers have their own internet crawling software or rent it from some other company that they use for the express purpose of detecting random mentions of copyrighted material to C&D. Things have gotten pretty bad over the years for anyone using material from major IPs even if it's just in the written word. It doesn't matter what site.
 
The idea of "lending" this stuff is gay and too metaphysical for me. Just pirate stuff, let Archive.org stick to rock shows and waybacking a more innocent time.
They did nuke it, and the publishers refused to let up.
AFAIK they basically have to, at least once they get started, and that's one of the more pernicious things about these laws. I'm not a lawyer but basically if you do not enforce your claims you can lose them.
 
The idea of "lending" this stuff is gay and too metaphysical for me. Just pirate stuff, let Archive.org stick to rock shows and waybacking a more innocent time.

AFAIK they basically have to, at least once they get started, and that's one of the more pernicious things about these laws. I'm not a lawyer but basically if you do not enforce your claims you can lose them.
I'd love to see receipts for this claim. I've heard it parroted a lot, but even with how bullshit US copyright is and how bad it's made the rest of the west's IP law, I would bet the number of times anyone's lost a case for their IP because they didn't sue everyone and everything can be counted on one hand. It's not like the companies that actually do the most of this don't have enough fuck you money to prevent the loss of their material.
 
Time for another PSA:


Why didn't you use the better one....

Also the left has been after Archive sites for a while now. I remember when archiving news articles got labeled as "right wing" a few years ago.

TPTB really, really, hate the idea of somebody being able to dig up proof too easily.
 
I don't think THEY do, I believe that at this point several publishers have their own internet crawling software or rent it from some other company that they use for the express purpose of detecting random mentions of copyrighted material to C&D. Things have gotten pretty bad over the years for anyone using material from major IPs even if it's just in the written word. It doesn't matter what site.
Yeah I mean IA lacks the resources to combat that so they take the path of least resistance.
The idea of "lending" this stuff is gay and too metaphysical for me. Just pirate stuff, let Archive.org stick to rock shows and waybacking a more innocent time.

AFAIK they basically have to, at least once they get started, and that's one of the more pernicious things about these laws. I'm not a lawyer but basically if you do not enforce your claims you can lose them.
Fair point but IA wasn't actually letting people pirate the books. They also only lent out as many digital copies as they had physical so it was pretty reasonable. I think publishers are more mad that this could set a precedent with normal libraries and by their powers combined will do a show of force that may hurt millions around the world that never even knew about the book lending program.
 
AFAIK they basically have to, at least once they get started, and that's one of the more pernicious things about these laws. I'm not a lawyer but basically if you do not enforce your claims you can lose them.
That's trademarks. Copyrights can be enforced or not, but you still retain them regardless.
 
My 32nd act as temporary dictator of America (after stripping California of statehood and putting them under military occupation) will be to revoke the copyright acts of 1978 and 1909 and if the Mouse complains...well, I would rather avoid summary executions but I'm not taking them off the table either.
You have my vote.
 