I used to post on 4chan many years ago (stopped regularly visiting there about a decade ago), and have lurked here occasionally for a number of years, mostly for the popcorn/debate. I visited a smattering of boards, but mainly the tech board. I eventually stopped as the quality of conversation got worse (maybe it was never good) and I got tired of arguing with internet strangers (hence the username here).
(Boring nerd shit coming).
I don't think it's a surprise to anyone that 4chan's code was not of the highest quality. What was shocking to see is that the backend software (FreeBSD and PHP versions at a bare minimum) hadn't been updated in a decade and went EOL almost a decade ago. PHP 5.6 versions have over 100 CVEs (vulnerability listings)... by all accounts, no regular care and feeding (updates) was done on the backend software. The end vulnerability as I've read it is that some boards permitted PDF uploads, but the library they used to process them was so old/vulnerable that you could give it a PostScript file with a fake PDF extension and then execute arbitrary code (whatever malicious code you want).
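To make the "PostScript with a fake PDF extension" part concrete, here is a minimal sketch in Python of checking what a file actually is instead of trusting its name. This is not 4chan's code (theirs is PHP), the function names are made up for illustration, and a check like this is only one layer: it would not have replaced patching the old library.

```python
# Hypothetical upload check: names and structure are illustrative assumptions.
PDF_MAGIC = b"%PDF-"         # real PDF files begin with this header
POSTSCRIPT_MAGIC = b"%!PS"   # PostScript files typically begin with this


def sniff_upload(data: bytes) -> str:
    """Classify an upload by its leading bytes, ignoring the file name."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if data.startswith(POSTSCRIPT_MAGIC):
        return "postscript"
    return "unknown"


def accept_pdf_upload(filename: str, data: bytes) -> bool:
    """Accept only when the extension AND the content both say PDF."""
    return filename.lower().endswith(".pdf") and sniff_upload(data) == "pdf"
```

With this, a file named evil.pdf whose contents start with %!PS gets rejected before it ever reaches whatever library renders the thumbnail.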
Now, as for the board software itself not being updated, I'm really not that surprised - it's not like a ton of features were added to 4chan in the last decade. When the board software did get some rewrites to improve HTML output and create a read-only JSON API to reduce data consumption/server load, that work had a dev domain with a developer capcode. From what I understand, that wasn't moot - that was moot bringing in Max Goldberg (of YTMND). He had a test.4chan.org domain many, many years ago that went nowhere before it came back, and the JSON API came about circa 2012. Leaks of the current source code show that it is pretty dogshit in quality with really bad practices (hardcoded credentials being just one of many examples) and that it used deprecated PHP calls to MySQL that would have gone from deprecated (shouldn't be used but still works) to unsupported (it would have broken the site had they updated the PHP version).
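For context on the deprecated MySQL calls: the old mysql_* functions (mysql_connect, mysql_query, etc.) were deprecated in PHP 5.5 and removed entirely in PHP 7.0, which is why a PHP upgrade alone would break the site. A rough sketch of how you might size up that problem in a codebase is below. It is written in Python, the function name and sample source are made up, and a regex scan like this is only a first pass (it will also flag matches inside comments and strings).

```python
import re

# Matches calls to the removed ext/mysql API (mysql_query, mysql_connect, ...).
# The word boundary keeps it from matching mysqli_* or names like my_mysql_foo.
LEGACY_CALL = re.compile(r"\bmysql_[a-z_]+(?=\s*\()")


def find_legacy_mysql_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each legacy mysql_* call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in LEGACY_CALL.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Made-up PHP snippet, purely for illustration.
    sample = (
        "<?php\n"
        '$c = mysql_connect("host", "user", "pass");\n'
        '$r = mysqli_query($c, "SELECT 1");\n'
        '$q = mysql_query("SELECT 1");\n'
    )
    for lineno, name in find_legacy_mysql_calls(sample):
        print(f"line {lineno}: {name}")
```

Every hit is a spot that has to be ported to mysqli or PDO before the code will run on a supported PHP version, which gives a feel for what "bare minimum updates" actually involves.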
Okay, so boring nerd shit aside, tl;dr, what does this mean? In short, the site returning anytime soon is dubious.
- If they update the site to remove the hole that the hacker used to get in (patch the PDF library only), the site is still known to be running multiple end-of-life versions of software with vulnerabilities. The chance that someone would hack their way in is 100%.
- If they update the backend software (PDF library, PHP, MySQL, FreeBSD OS) without updating the site software, at least for PHP, the site will break.
- If they update the backend software and then perform bare minimum updates to get the site code working on the new PHP version (more probable since there seems to be a large amount of community interest that didn't previously exist), the source code for the site has leaked, and it is pretty trash. The likelihood that someone finds an exploit is pretty high.
- Taking the existing site code and doing a larger overhaul of it, to screen it for security issues and make it more resilient, is not trivial and will take more time. It still bears the risk of some autist going through the leaked code and stashing a vuln aside for later.
- If they go and implement a newer board software that isn't as much of an outdated mess as Yotsuba was on 4chan, then that will take time, will have compatibility issues with the current DB, etc. On top of that, people on other sites (such as refugees trying to "fix" the code) say that 4chan's code is a dumpster fire but, despite all its ills, highly performant, so a replacement could run into scalability issues. Also, all the customization (bot fingerprinting, ban system, etc.) would have to be redone.
A few other things:
- From what I've read, everything they did was tested in production; the leak has files in the vein of whatever.php and whatever-test.php. This is probably why some updates to the site software were made but the backend was untouched (no non-production/test environment to see if your changes were going to stop the site or grind it to a halt).
- Once an environment is compromised that badly, you just flatten it and rebuild it. You don't trust the compromised environment, meaning they should reinstall everything on the box. Setting up from scratch can take time, but making sure the box was clean would be more arduous and time consuming.
- How thick- or thin-skinned the doxxed janitors/mods are remains to be seen. Hard to run a site without mods if you want advertisers (and Hiroyuki is in it for money, so that matters).
My guess? 4chan returns in some form, minimum time two weeks. If it's less than that, there's a very high chance they're using dogshit from the old site in the relaunch that will get exploited. If they take too long though, there's a large risk that a lot of users start to move on.
The biggest wildcard is whether they realize how much work it's going to be to relaunch the site, and whether there's anybody they trust who would be involved for free/cheap. If it's too much effort, Hiroyuki may just pull the plug, look to sell the site and wash his hands of it, and it doesn't come back.
Time will tell.