I always just gave the -j flag the number of threads on the CPU plus one. Now, with something like a 32-core Threadripper, I'm not sure -j65 would even show a benefit.
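For the record, that rule of thumb as a command (nproc counts logical CPUs, so on a 32-core/64-thread Threadripper this comes out to -j65):

    make -j$(($(nproc) + 1))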
Yeah, that's a good general guideline. I also add -l [number of logical cores], which seems to work well at keeping background compile jobs unnoticeable. There's also still the option of giving the compile job lower priority to make sure interactive tasks come out on top, but to be entirely honest I haven't felt a serious impact on interactivity from system load in a while. Since autogrouping was introduced it kinda doesn't seem to be a thing anymore.
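Something like this is what I mean, numbers assuming a 64-thread machine (adjust to taste): -l stops make from spawning new jobs while the load average is above the limit, and nice 19 is the lowest CPU priority:

    nice -n 19 make -j65 -l64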
There's an inherent point of diminishing returns with parallelizing tasks like this, especially since some things have to happen in a specific sequence. It then doesn't matter whether you have 2 cores or 200: if A has to happen before B and a single core is slow at processing A, then 199 cores will just have to wait. Each additional compile job also takes some resources (especially RAM), and depending on what the machine is doing otherwise, that can actually make things slower, overhead-wise, especially since Linux still doesn't handle low-memory situations gracefully, although there are some out-of-tree kernel patches that improve that bit. (I'm not talking about OOM situations; those are still downright catastrophic for the average Linux system without outside help from daemons.)
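That's basically Amdahl's law: if a fraction s of the work is strictly serial, the best possible speedup on N cores is 1 / (s + (1 - s) / N). Even with just 5% serial work, 64 cores get you at most 1 / (0.05 + 0.95/64) ≈ 15x, and no number of cores will ever beat 20x.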
I gotta admit I didn't watch the video, and maybe I'm just misunderstanding things, but compiling a Linux kernel for a given machine with just the options that machine needs doesn't and shouldn't take hours. On a reasonably modern processor I'd talk minutes. (Out of curiosity, I actually recompiled my kernel from scratch in the background on my six-core mid-range Zen 2 system while typing this post; it took 4 minutes and 33 seconds complete with building the initramfs image, and the majority of that time seems to have gone to compiling amdgpu stuff.)
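If a kernel build takes hours, my guess is it's an everything-enabled distro config. A rough sketch of trimming it down to what the running machine actually uses (paths assume CONFIG_IKCONFIG_PROC or a distro that ships its config in /boot; localmodconfig disables modules that aren't currently loaded):

    # in the kernel source tree, start from the running kernel's config
    zcat /proc/config.gz > .config     # or: cp /boot/config-$(uname -r) .config
    make localmodconfig                # drop modules not loaded right now
    make -j$(nproc)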
The slowest system I have an active Gentoo install on is an Allwinner A20 (2x Cortex-A7) with 2 GB of RAM. Speed-wise, if I had to wager a very wild guess, it's somewhere in the area of a Pentium 3, maybe; a comparison of actual, non-synthetic performance across architecture borders is hard to make. Perfectly feasible, although of course it doesn't build packages like Firefox, nor does it normally have high CPU demands. I don't bother with distcc or building binary packages on a faster system (although that's possible across architecture borders); it really does everything by itself. Whether that actually makes sense, well...
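For anyone who does want to offload builds from a box like that, the Gentoo distcc route is roughly this (the host IP is just a placeholder, and for cross-architecture use the fast machine needs a matching cross toolchain, e.g. via crossdev):

    # /etc/distcc/hosts on the A20, pointing at the fast box (max 8 jobs)
    192.168.1.10/8
    # /etc/portage/make.conf
    FEATURES="distcc"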
The Linux kernel build process usually uses /tmp for temporary files, and there's nothing speaking against putting /tmp on a tmpfs, since there's no inherent persistence guarantee for files in /tmp and they don't need to survive a reboot. A program that relies on anything in /tmp not being, well, temporary, is broken.
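A minimal /etc/fstab entry for that (the size= limit is just an example; tmpfs only uses RAM for what's actually stored in it, and mode=1777 is the usual sticky, world-writable /tmp permission):

    tmpfs  /tmp  tmpfs  size=4G,mode=1777,nosuid,nodev  0 0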
TPM is such a cool concept in theory (as long as it isn't buggy or exploitable): so many things you can do with it beyond encrypting the HDD. TPM 2.0 is what Win11 insists the system have, and IIRC that's what all the drama about "Win11 compatibility" really was about (or at least having a TPM at all? Actually not sure). Also so many possibilities to lock yourself out of things forever. Truly the forbidden fruit. (Also yeah, security-wise it might be backdoored, but so might the rest of your system, so whatever really.)
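The encrypting-the-HDD case on Linux, for example, is sealing a LUKS key against the TPM; a sketch with systemd-cryptenroll (the partition is a placeholder, and binding to PCR 7 ties the key to the Secure Boot state, which is exactly one of those lock-yourself-out opportunities, so keep a recovery key):

    systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=7 /dev/nvme0n1p2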