I've rolled my eyes in the past at some of the sperging that has come out of the Free Software Foundation, but I think I just might support the FSF suing to block this project.
this is a super interesting issue! but jesus that guy's voice made me want to kill myself then others. here is the link to him talking about it
https://github.com/chardet/chardet/issues/327
The guy who did this 7.0 version wants to free this program from its LGPL shackles by rewriting the code from scratch. It raises a bunch of interrelated legal issues at once so not surprisingly the github thread is getting it wrong in all sorts of ways.
code is protected by copyright. copyright is narrow - it's the exact expression of the idea that is copyrighted. the issue is inherently complicated because with the way copyright works, it's the author of the programming language itself that should have an exclusive copyright. putting third-party libraries etc. aside, any program is an assembly of functions pre-defined by the language's author. there's only so many ways you can assemble the legos, so it would be weird to give whoever first put the legos together a certain way a blocking right that kept any other users of the programming language from using the same function in the same way.
to address that issue the law says purely "functional" code is not copyrightable. also, ideas are not copyrightable, so you can't copyright a feature like a web checkout flow. separately, there's a long-standing "independent creation" affirmative defense to copyright infringement where if the author can show evidence they didn't copy the other work and came up with it themselves, it doesn't matter that the words are literally the same as the original work. this is where the notion of "clean room" development comes in. one team reviews the competitor's program, and derives specifications for what a competing product would have to do. a second team, armed only with those specifications, then generates their own solution. because the specifications are abstract ideas, they aren't copyrightable. so the company can prove the second team didn't have access to the competing product's source code, and therefore anything they came up with was independently created. secondarily, it lets you argue any identical code is therefore functional b/c an independent team came up with the same approach.
all of this breaks down when source code is on github b/c how do you weigh the fact that a developer could always secretly be checking github at a public library or something outside the clean room? this happens enough that companies use blackduck or similar to be sure lazy dev employees didn't steal code from an open source project and stick it in without telling anyone. the recent big litigation over this issue was Google v. Oracle.

The holding:
i break this down b/c it gets called out in the github thread. this is pretty wild - google got to use Java APIs for Android not b/c it had to in order to interface with java, but to give programmers an interface they already knew. oracle and google could have worked out a licensing deal for the code.
Google made two arguments - APIs aren't copyrightable, and if they are, their use of Java SE's APIs in AndroidOS was fair use.
The court declined to decide whether APIs are copyrightable b/c it didn't want to create big unintended consequences for software development: "Given the rapidly changing technological, economic, and business-related circumstances, we believe we should not answer more than is necessary to resolve the parties’ dispute. We shall assume, but purely for argument’s sake, that the entire Sun Java API falls within the definition of that which can be copyrighted."
The court ruled that to the extent the APIs were copyrightable, Google's use of them (wholesale copying, for no justification other than APIs are intuitive user interfaces, a tiny portion of the code, didn't compete with Java SE) was fair use. The majority opinion cast a lot of skepticism on APIs being copyrightable. In explaining the breakdown between functional and protected code, the court noted "In our view, for the reasons just described, the declaring code is, if copyrightable at all, further than are most computer programs (such as the implementing code) from the core of copyright. That fact diminishes the fear, expressed by both the dissent and the Federal Circuit, that application of “fair use” here would seriously undermine the general copyright protection that Congress provided for computer programs."
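if the declaring vs. implementing code distinction is fuzzy, here's a rough sketch in python. every name in it (Detector, BomDetector, TrialDecodeDetector, describe) is made up for illustration, none of it is chardet's or Java's actual code. the Protocol is the "declaring code": the name, the parameter, and the return type a caller has to know. the two classes are different "implementing code" sitting behind that same declaration.

```python
from typing import Protocol


class Detector(Protocol):
    # "declaring code": just the name, parameter, and return type callers rely on
    def detect(self, data: bytes) -> str: ...


class BomDetector:
    # one implementation: look for a UTF-8 byte-order mark at the start
    def detect(self, data: bytes) -> str:
        if data.startswith(b"\xef\xbb\xbf"):
            return "utf-8"
        return "unknown"


class TrialDecodeDetector:
    # a different implementation behind the same declaration: try to decode
    def detect(self, data: bytes) -> str:
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            return "unknown"
        return "utf-8"


def describe(detector: Detector, data: bytes) -> str:
    # callers only depend on the declaration, not on which implementation runs
    return detector.detect(data)
```

a caller written against Detector works with either class, which is the "programmers already know the interface" point. the bodies are where the two authors actually made different choices.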
Back to chardet:
the LLM thing is a bit of a red herring. The goal here was to rewrite an existing software project without using any of the LGPL-licensed code. Whether it was this guy Dan doing it, or him using an LLM to do it, they had to reference the existing code to be sure they were rewriting things so as not to literally copy the code on the page. It means there is no "independent creation" defense, but if Dan only replicates functional code snippets in the rewrite, then he isn't infringing the LGPL-licensed code to begin with since that code either isn't copyrightable or its use is fair use.
you could try to make an argument that the LLM ingesting the code is copying it in a way that violates copyright law. but (1) LGPL doesn't restrict private use of code like this, it only attaches conditions to distribution. so Dan is within the bounds of the LGPL license if he used it privately to design software with the same functionality without infringing any protected LGPL code. and (2) even looking at naked copyright principles, that copy is only being used to make sure the rewrite is not infringing code. closest case i can think of on this is Perfect 10, which would mean this is fair use too.
but it's still a red herring b/c ok - say i'm wrong on both points and the use of the LLM is the problem. Then Dan just rewrites it by hand while looking at the github page, because then there's no technical copying happening.
from the comments, it might be that the LLM fucked up and copied large swathes of code wholesale. if true, that's an issue and dan made a mistake by not manually reviewing every line of code and rewriting heavily copied sections, and documenting anything he kept as functional and why. that's what "clean room" development would look like in this context.
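for what the "review every line" step could look like in practice, here's a minimal sketch using python's standard difflib. this is not how blackduck or any real scanner works, and normalize, copied_runs, and the min_lines threshold are hypothetical names and choices of mine. it only flags runs of identical lines (after stripping whitespace and comment-only lines) so a human knows where to look. it says nothing about whether a flagged run is functional or protected.

```python
import difflib


def normalize(source: str) -> list[str]:
    # strip indentation and drop blank lines and comment-only lines
    lines = []
    for raw in source.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def copied_runs(original: str, rewrite: str, min_lines: int = 3) -> list[list[str]]:
    # return runs of at least min_lines consecutive lines shared by both files
    a, b = normalize(original), normalize(rewrite)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    runs = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_lines:
            runs.append(a[block.a:block.a + block.size])
    return runs
```

anything this returns is a candidate for either a rewrite or a written note explaining why the overlap is purely functional. renamed variables or reordered statements will slip past a line-level check like this, so it's a floor, not a substitute for reading the code.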
The github thread:
This guy misreads everything by citing the federal circuit appellate court opinion, which reversed the district court to hold the APIs were copyrightable and it was not fair use. The federal circuit is a specialized court that hears patent appeals plus a handful of other things like court of federal claims appeals (this case only landed there b/c it originally included patent claims), and the specialization has made them over-complicate IP law and come down too far in favor of rightsholders. The result is the supreme court has kept reversing their attempts to expand IP law beyond its natural borders, Google v. Oracle being one of those cases.

This Norwood master pointed out he was wrong

Where it gets aggravating again is how many people liked his uninformed response. the usual reddit thing where if you act short, snotty and authoritative people think you're right b/c that's how neckbeards act
It's hard to tell if he's just being an arrogant engineer who thinks he doesn't need context to understand the interplay between the Federal Circuit and SCOTUS, or he's saying all of this in bad faith. But either way, this is totally wrong - SCOTUS' approach to reversing the Federal Circuit was a complete slapdown. They won't make sweeping rulings about API copyrightability unless they absolutely have to, but five of the six justices from the 6-2 majority (Roberts, Sotomayor, Kagan, Gorsuch, Kavanaugh) are still on the court, and if Jackson votes the way Breyer did, there's still a 6-3 coalition that is skeptical of copyrighting APIs.
An unexpected interesting point - someone did point out how much the assumptions of GPL fail to apply in a post-github world and that's part of why GPL hasn't succeeded in its goal to virally seize the means of software production.
as they point out, companies always had the right to rewrite software like this. GPL had leverage b/c it was too expensive and difficult for companies to build their own alternative. software development is getting both simpler and cheaper. AI and IDEs are part of this but so are really cheap, very good developers in latam and eastern europe. this is freaking people out because it's removing the non-legal friction that gave GPL a moat. The same is true of proprietary software, which is what's prompting a mini-wave of doomerism among SaaS companies. In that sense this parallels what happened with music and video piracy. copying tapes was at-home piracy almost everyone did at some point, but it was time consuming, relatively expensive due to the cost of the media, and quality suffered. the result is you had mixtape/remix culture but not to the extent it cannibalized the music industry. napster turned all of that on its head by removing the friction to copying. if vibecoding keeps improving this issue could be on the same trajectory. the big question for us of course is what vibecoding actually costs when the technology has matured and the era of subsidized tokens is over - comparative advantage should mean that it's still cheaper to get a product from salesforce than it is to kluge your own vibecoded thing together.