Postmortem September 17th outage and rollback

Thanks for sacrificing your Sunday so we can say nigger and faggot, Null. You are my favorite niggerfaggot :feels:


It hasn't been addressed in this thread so far, but Tor is very broken.
See here (.st link)


Edit: posting and quoting work fine on Tor again. All stickers also work again. Thx, Jersh!
 
Here's me right now thinking Liz-Fong-Jones who is a self admitted rapist; who also admitted to scum journalists being part of a collective who have committed to crimes to prevent a lawful US business from operating using tactics which are (I believe) a federal crime.
 
Assuming it's a PCIe x8 NVMe controller, that likely shit the bed, which would cause all 4 drives to show dead simultaneously. Sucks because it's not something people just have lying around.
 
Throw a coin to your slobbermutt
Great to see that your hardware always seems to have a lifetime of a few years
 
Can't post normally since the post form doesn't load because of borked CSP. Had to use an addon that temporarily disables CSP just to post.
Screenshot_20230917_235606.png
Screenshot_20230917_235447.png
Screenshot_20230917_235523.png

Upd: seems to be fixed now.
 
All of the drives at once is bad luck. Glad Null wasn’t so retarded he didn’t do backups.
 
Made an account to clarify things. I'm the dude that helps Null with this stuff when he needs it. No, I don't monitor the server, 'cause it's Null's. But maybe I'll set up some tooling for Null to monitor things, including drive health.

The storage on the server uses ZFS pools. The SATA SSD array (SNEED pool) bypasses the RAID controller and is entirely JBOD passthrough.
The 4 x NVMe drives are U.2 drives in the front and are in the FEED ZFS pool. The backplane handles SATA, SAS, and U.2. U.2 has its own area for those drives and connects to the motherboard with an OCuLink cable to a JNVMe header.
The 4 x 1.6TB WD Ultrastar DC SN620 NVMe U.2 drives disappeared from the server last night. But before that, the kernel reported write errors on one of them.
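
For the drive health monitoring idea, a minimal sketch of a pool check could look like the following. It assumes the `zpool` CLI is installed and parses the tab-separated output of `zpool list -H -o name,health`. The function names and the sample text are made up for illustration; this is not what's actually running on the server.

```python
import subprocess


def parse_pool_health(text):
    """Parse tab-separated `zpool list -H -o name,health` output into a dict."""
    pools = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        pools[parts[0].strip()] = parts[1].strip()
    return pools


def unhealthy_pools(pools):
    """Return the names of pools whose state is anything other than ONLINE."""
    return sorted(name for name, health in pools.items() if health != "ONLINE")


def read_pool_health():
    """Run zpool and parse its output (assumes the ZFS tools are installed)."""
    out = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_pool_health(out)


# Hypothetical sample output, using the pool names from this post.
sample = "SNEED\tONLINE\nFEED\tUNAVAIL\n"
print(unhealthy_pools(parse_pool_health(sample)))
```

Run from cron or a systemd timer, something like this could send an alert whenever the list is non-empty. It only catches pool-level state, so per-drive SMART checks would still be worth adding on top.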

There are a few reasons this could have happened, from most likely to least likely:
- BIOS/UEFI firmware stopped communicating with the NVMe drives. This happened with a certain BIOS setting when the server was initially set up
- The drives actually died from the workload. Unlikely, considering these are rated for 1.7 drive writes per day (DWPD), but still feasible. These are second-hand enterprise drives
- The backplane/JNVMe headers exploded. Super unlikely
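
To put the 1.7 drive writes per day figure in perspective, here's the arithmetic as a rough sketch. The 5-year rating period is an assumption (a common warranty window for enterprise SSDs, not checked against the datasheet for these drives), and the function names are just for illustration.

```python
def daily_write_budget_tb(capacity_tb, dwpd):
    """Rated writes per day in TB: capacity times drive writes per day."""
    return capacity_tb * dwpd


def rated_endurance_tb(capacity_tb, dwpd, years=5):
    """Total rated writes in TB over the rating period (years is an assumption)."""
    return daily_write_budget_tb(capacity_tb, dwpd) * 365 * years


per_day = daily_write_budget_tb(1.6, 1.7)
total = rated_endurance_tb(1.6, 1.7)
print(f"{per_day:.2f} TB/day per drive, about {total:.0f} TB over 5 years")
```

That's roughly 2.7 TB of writes per drive per day. Since the drives are second-hand, some unknown share of that budget was already used up by the previous owner.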

The drives are likely still alive, and the server's firmware probably took a shit.
We need to inspect the server's BIOS settings or possibly even update the firmware. Then we can determine if the drives are toast or still usable.
There was no foul play at hand here. At best a firmware bug, at worst, the drives sudoku'd.
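
A quick first check before blaming the drives is whether the kernel enumerates the NVMe controllers at all. Here's a minimal sketch, assuming a Linux host where controllers show up under `/sys/class/nvme`. The helper names and the expected count of 4 are illustrative.

```python
from pathlib import Path


def list_nvme_controllers(sysfs_root="/sys/class/nvme"):
    """Return sorted names of NVMe controllers the kernel currently exposes."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # no NVMe class directory: nothing enumerated (or not Linux)
    return sorted(p.name for p in root.iterdir() if p.name.startswith("nvme"))


def missing_controllers(expected_count, found):
    """How many of the expected drives are not showing up."""
    return max(0, expected_count - len(found))


found = list_nvme_controllers()
print(f"kernel sees {len(found)} NVMe controller(s): {found}")
print(f"missing out of 4 expected: {missing_controllers(4, found)}")
```

If all four are missing, that points more toward the firmware or the link than toward four drives dying at once. A controller showing up here doesn't prove the drive is healthy, only that it's being detected.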
 