GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

It also depends on what you're doing. Crunching numbers for crunching numbers' sake? Getting new processors is worth it.

Especially on the desktop, these things are only a few hundred bucks. Even at the high-end workstation level, back before Moore's Law collapsed, we were replacing our $10K workstations every 3 years because the productivity gains were so high. Now we're on more like a 5-6 year schedule.
 
Math isn't actually that single-threaded, at least not at the level our sims work at. Think of something like simulating the flow of air through a turbine. If you represent a discrete volume of air as many smaller volumes and then step through the motion, you'll probably see what I mean. Each volume depends on its neighbours, but there is still a lot you can do in parallel. Engineering workstations still prioritise core count over single-core FLOPS.
OK, so it's sims that are truly taxing, but even they can get split up somewhat, which is why those machines have tons of cores. But not completely, I'm getting it. It's also why those workstation cores burn so hot: they need both to crunch those kinds of numbers. Leaning only on core count or only on single-core FLOPS won't work well.
 
OK, so it's sims that are truly taxing, but even they can get split up somewhat, which is why those machines have tons of cores. But not completely, I'm getting it. It's also why those workstation cores burn so hot: they need both to crunch those kinds of numbers. Leaning only on core count or only on single-core FLOPS won't work well.
The high-end workstation CPUs actually run at lower clock speeds than high-end desktops. The reason is that if you have 2x the cores, you need roughly 2x the power, but if you want 2x the GHz, you need something like 4x the power (higher clocks also need higher voltage), and the applications you care about can take advantage of more cores efficiently.
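Back-of-envelope version of that rule of thumb (dynamic power scales roughly with voltage squared times frequency; the ~1.4x voltage bump is an assumption for illustration, not a figure for any real chip):

```python
# Rough dynamic-power rule of thumb: P ~ C * V^2 * f. Doubling cores duplicates the
# silicon at the same voltage and clock, so ~2x power. Doubling the clock also needs
# a voltage bump (assumed ~1.4x here, purely illustrative), so power goes up ~4x.
def relative_power(freq_scale: float, voltage_scale: float) -> float:
    return (voltage_scale ** 2) * freq_scale

print(2 * relative_power(1.0, 1.0))   # 2x cores, same clock/voltage: ~2x power
print(relative_power(2.0, 1.4))       # 2x clock with ~1.4x voltage: ~3.9x power
```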

Here is a good image illustrating what @snov was talking about. This is from an engine cylinder simulation using a free simulation tool called KIVA. Each color represents a different chunk of the simulation that is run on a different CPU core. The way it works is you do a little bit of work independently on each sub-domain, then do an exchange operation on the interfaces to coordinate them, and keep repeating until the work is done. In principle, this method can scale up to as many cores as you have, and can work on any 3D physics...electromagnetics, structures, sprays, whatever. There are simulations like this that have run on over 100,000 CPU cores.

[attached image: KIVA engine cylinder simulation, sub-domains colored by CPU core]

To put the computational needs into perspective, an external aerodynamics simulation of a car that captures all the unsteady, buffeting behavior can take around 24 hours on 1024 CPU cores. This is the sort of computer you might want for that:

[attached image]
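If anyone wants to see that compute/exchange/repeat pattern in code, here is a toy 1-D version in plain NumPy. It's a serial stand-in where each chunk plays the role of one core's sub-domain (a real solver would run the chunk loop across MPI ranks); the diffusion update, chunk count, and step count are made-up illustration values, not anything from KIVA.

```python
# Toy 1-D domain decomposition: independent work per chunk, then a halo exchange
# at the interfaces, repeated every step -- the same pattern described above.
import numpy as np

ncells, nchunks, steps, alpha = 120, 4, 200, 0.25
field = np.zeros(ncells)
field[ncells // 2] = 1.0                      # initial hot spot
chunks = np.array_split(field, nchunks)       # one sub-domain per "core"
halos = [np.zeros(2) for _ in chunks]         # [left ghost, right ghost] per chunk

for _ in range(steps):
    # 1) exchange: copy each neighbour's boundary cell into our ghost cells
    for i, c in enumerate(chunks):
        halos[i][0] = chunks[i - 1][-1] if i > 0 else c[0]
        halos[i][1] = chunks[i + 1][0] if i < nchunks - 1 else c[-1]
    # 2) independent diffusion update on each sub-domain (this loop is what
    #    an MPI code would run in parallel, one chunk per rank)
    for i, c in enumerate(chunks):
        padded = np.concatenate(([halos[i][0]], c, [halos[i][1]]))
        chunks[i] = c + alpha * (padded[:-2] - 2 * c + padded[2:])

print(np.concatenate(chunks).sum())           # total "heat" is (roughly) conserved
```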
 
The high-end workstation CPUs actually run at lower clock speeds than high-end desktops. The reason is that if you have 2x the cores, you need roughly 2x the power, but if you want 2x the GHz, you need something like 4x the power (higher clocks also need higher voltage), and the applications you care about can take advantage of more cores efficiently.

Here is a good image illustrating what @snov was talking about. This is from an engine cylinder simulation using a free simulation tool called KIVA. Each color represents a different chunk of the simulation that is run on a different CPU core. The way it works is you do a little bit of work independently on each sub-domain, then do an exchange operation on the interfaces to coordinate them, and keep repeating until the work is done. In principle, this method can scale up to as many cores as you have, and can work on any 3D physics...electromagnetics, structures, sprays, whatever. There are simulations like this that have run on over 100,000 CPU cores.

[attached image]
Oh. Oh wow. That's why supercomputers are so big even today. 100,000 cores working on one simulation is a lot, to understate it. Fascinating. So with regular desktop cores, though, they burn hotter to compensate, I'm guessing?
 
Oh. Oh wow. That's why supercomputers are so big even today. 100,000 cores working on one simulation is a lot, to understate it. Fascinating. So with regular desktop cores, though, they burn hotter to compensate, I'm guessing?

Regular desktop cores run at high clock speeds because a lot of applications are single-threaded. It is easy to understand why this is without knowing anything about computers.

Imagine a fairly complicated task that takes one man a day, such as repairing a car. If you have eight guys work on it, will it take an hour? Of course not. It's the sort of thing that only one guy can do. A smart, skilled guy might get it done in four hours, and a novice might take three days, but really, the only way to get it done faster is to have a better mechanic with better tools. However, if I have 8 cars to fix, one mechanic will take 8 days, while, indeed, 8 mechanics will get the job done in one day. They can work in parallel. There will be some coordination overhead from management. Perhaps there is some sharing of more expensive tools. But overall, it's pretty easy to imagine how to get 8 mechanics to work on 8 cars and get an 8x speedup over having 1 mechanic in the shop.

Since a computer program is, in the end, nothing more than a task for an electronic worker to do, you might imagine, correctly, that some tasks by nature cannot be easily broken up and farmed out to a large number of workers. Such tasks don't benefit from more cores, only from faster, smarter cores. Large-scale physics computations are extremely easy to break up, and the methods for doing so were invented in the 1970s, mostly at NASA.
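The mechanic picture maps straight onto the usual Amdahl's law formula, if you want numbers. The parallel fractions below are made up purely for illustration:

```python
# Amdahl's law: if a fraction p of the job can be split across n workers,
# the remaining (1 - p) stays serial and caps the overall speedup.
def speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

print(speedup(1.0, 8))        # 8 independent cars: the ideal 8.0x
print(speedup(0.5, 8))        # one car, half the work parallelisable: only ~1.8x
print(speedup(0.95, 100000))  # 95% parallel on 100,000 cores: caps out near 20x
```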
 
Oh. Oh wow. That's why supercomputers are so big even today. 100,000 cores working on one simulation is a lot, to understate it. Fascinating. So with regular desktop cores, though, they burn hotter to compensate, I'm guessing?
Somewhat. You can only clock so high before power and heat stop you.

Another thing outside of physics sims is rendering CGI. It benefits from faster cores, but it tends to benefit even more from more cores.
 
Regular desktop cores run at high clock speeds because a lot of applications are single-threaded. It is easy to understand why this is without knowing anything about computers.

Imagine a fairly complicated task that takes one man a day, such as repairing a car. If you have eight guys work on it, will it take an hour? Of course not. It's the sort of thing that only one guy can do. A smart, skilled guy might get it done in four hours, and a novice might take three days, but really, the only way to get it done faster is to have a better mechanic with better tools. However, if I have 8 cars to fix, one mechanic will take 8 days, while, indeed, 8 mechanics will get the job done in one day. They can work in parallel. There will be some coordination overhead from management. Perhaps there is some sharing of more expensive tools. But overall, it's pretty easy to imagine how to get 8 mechanics to work on 8 cars and get an 8x speedup over having 1 mechanic in the shop.

Since a computer program is, in the end, nothing more than a task for an electronic worker to do, you might imagine, correctly, that some tasks by nature cannot be easily broken up and farmed out to a large number of workers. Such tasks don't benefit from more cores, only from faster, smarter cores. Large-scale physics computations are extremely easy to break up, and the methods for doing so were invented in the 1970s, mostly at NASA.
Fair enough. Explains why games and the like have been stubborn about moving away from single-threading; most consumer hardware is set up for it. Especially Intel, which dominates the most popular computer: the laptop.
 
Explains why games and the like have been stubborn about moving away from single-threading; most consumer hardware is set up for it.

The reason game devs dragged their feet on multithreading is that it is difficult to do well. However, clock speeds just aren't getting a lot faster. I got a 12 MHz CPU in 1990. Five years later, in 1995, I got a 100 MHz CPU, almost 10x faster. In 2001, six years later, I got a 1.2 GHz CPU, again more than 10x faster. That was 23 years ago. If we had kept that pace going, I should have at least a 1,000 GHz CPU in my computer by now. At this point, you have to be at least good enough with parallel code to hit 60 fps with an 8-core, 3.5 GHz CPU. That's a pretty easy target to hit, really.
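For a sense of what that target means in raw cycles (same numbers as above, and assuming you can actually keep every core busy):

```python
# Frame budget for 60 fps on an 8-core 3.5 GHz part.
fps, cores, clock_hz = 60, 8, 3.5e9
frame_ms = 1000 / fps                      # ~16.7 ms of wall time per frame
cycles_per_core = clock_hz / fps           # ~58 million cycles per core per frame
total_cycles = cores * cycles_per_core     # ~467 million cycles per frame if parallel
print(f"{frame_ms:.1f} ms/frame, {cycles_per_core/1e6:.0f}M cycles/core, "
      f"{total_cycles/1e6:.0f}M total")
```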
 
The reason game devs dragged their feet on multithreading is that it is difficult to do well. However, clock speeds just aren't getting a lot faster. I got a 12 MHz CPU in 1990. Five years later, in 1995, I got a 100 MHz CPU, almost 10x faster. In 2001, six years later, I got a 1.2 GHz CPU, again more than 10x faster. That was 23 years ago. If we had kept that pace going, I should have at least a 1,000 GHz CPU in my computer by now. At this point, you have to be at least good enough with parallel code to hit 60 fps with an 8-core, 3.5 GHz CPU. That's a pretty easy target to hit, really.
Yes, but IPC is decoupled from frequency. While frequencies aren't improving much right now, actual performance still is. The FX-8350 was hitting 8.5 GHz under exotic cooling a decade ago, but that didn't mean it wasn't garbage.

Of course Moore's Law is still dead dead, but going by frequency alone isn't very useful.
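Quick sketch of the IPC point: performance is IPC times clock, so the faster-clocked chip can still lose. The IPC figures here are invented for the example, not measurements of any CPU mentioned in this thread:

```python
# Throughput = instructions-per-cycle * clock. Both chips below are hypothetical.
def instr_per_second(ipc: float, ghz: float) -> float:
    return ipc * ghz * 1e9

print(f"{instr_per_second(0.8, 4.5) / 1e9:.1f} G instr/s")  # high clock, weak IPC
print(f"{instr_per_second(2.5, 3.5) / 1e9:.1f} G instr/s")  # lower clock, strong IPC
```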
 
Yes, but IPC is decoupled from frequency. While frequencies aren't improving much right now, actual performance still is. The FX-8350 was hitting 8.5 GHz under exotic cooling a decade ago, but that didn't mean it wasn't garbage.

Well, we've always had IPC increases. My 100 MHz Pentium could do more per cycle than my 12 MHz 286 could for sure - the 286 could not even process floating-point numbers in hardware! So in terms of real-world compute performance, it was well beyond 10x faster, depending on workload. By contrast, my i9-12900's single-core performance is only about 2x higher than my old i7-7700 on typical benchmarks.

Also, note that the theoretical limit is still basically clock speed times what the core can do per cycle. Newer CPUs are much better at staying close to that theoretical limit than older ones, hence the IPC uplift you tend to get from gen to gen.
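To put a rough number on "theoretical limit": peak math throughput is basically cores x clock x FLOPs-per-cycle. The per-cycle figure below assumes a core with two 8-wide FP32 FMA units, which is typical of AVX2-era designs but not the spec of any particular chip in this thread:

```python
# Peak FP32 throughput sketch: an upper bound real code only approaches.
cores, clock_ghz, flops_per_cycle = 8, 3.5, 32   # 2 FMA units x 8 lanes x 2 ops (assumed)
peak_gflops = cores * clock_ghz * flops_per_cycle
print(f"{peak_gflops:.0f} GFLOPS theoretical peak")
```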
 
Well, we've always had IPC increases.
Weren't the original P4s a lot slower per clock than the PIIIs? The FX series was definitely slower per clock than the Phenom IIs, and there is no way Intel hasn't regressed a bit in recent years with all the hardware patches covering up the speed holes.
 
Sad day for my computer kiwi bros. My X570-based desktop that I've been using since late 2019 has confirmed memory errors. I found out when large Steam games (>8 GB) would fail to install and show a corrupt update file error. MemTest86 confirmed many failed addresses in my G.Skill 3600 MHz sticks. Not too thrilled about their short demise.
G.Skill RAM has a limited lifetime warranty
 
Weren't the original P4s a lot slower per clock than the PIIIs?

Yes, but that was because of a fundamental design flaw. The original P4 (Willamette) had a 20-stage pipeline, and the later Prescott cores stretched it to 31 stages. The problem is that a mispredicted branch causes the pipeline to flush and start over, and that many stages is just way too much work to have to redo. The Athlon 64 had a 12-stage pipeline. Golden Cove, one of Intel's more recent architectures, has a 17-stage pipeline. So that should tell you something about how insane 31 stages was back then.

IPC is an experimental measurement, so while a P4 did very well on vectorizable 64-bit floating-point math with little to no branching, a lot of real-world code doesn't look like that, and it shit the bed in the real world.
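You can see why depth matters with a crude model: every mispredicted branch throws away roughly a pipeline's worth of work. The base IPC and mispredict rate below are invented just to show the shape of the effect:

```python
# Crude effective-IPC model: each misprediction flushes ~pipeline-depth cycles.
def effective_ipc(base_ipc: float, depth: int, mispredicts_per_1k: float) -> float:
    flush_cycles_per_instr = depth * mispredicts_per_1k / 1000.0
    return 1.0 / (1.0 / base_ipc + flush_cycles_per_instr)

for depth in (10, 20, 31):   # roughly PIII-ish, Willamette-ish, Prescott-ish depths
    print(f"{depth}-stage: {effective_ipc(2.0, depth, 10.0):.2f} IPC")
```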
 
Pentium fucking 4. I suddenly feel old. I think I can still dig that out of the old shed. I remember those days - you could still rely on your computer to seize up every now and then.
 
Weren't the original P4s a lot slower per clock than the PIIIs? The FX series was definitely slower per clock than the Phenom IIs, and there is no way Intel hasn't regressed a bit in recent years with all the hardware patches covering up the speed holes.

Dual P3 outperformed the early P4 to such a degree that Intel swore it would never allow another consumer-ish dual-CPU motherboard ever again, and it still hasn't in 25 years. I kept using my dual P3 1 GHz desktop machines until it was embarrassing to admit to. Eventually I had to switch because software was being tailored to assume that you were on a P4 and that any SMP it saw was from Hyper-Threading, which mine was not.

The launch 1.4 GHz P4 was shockingly bad, specifically. It's highly desirable in retro computing now; I managed to score one from a dead guy whose stuff was being tossed by his shitty family. It came with a very strange Intel motherboard too: dual-channel 266 MHz DDR capable, but it could also run PC133 SDRAM and underclock as low as PC100. Stable as fuck with Windows 98 and drivers. But for the "new era" it was released in, this was a shockingly unacceptable smoldering heap of hot fucking garbage.

Still got a Phenom II quad that was my happy place for a super long time. It runs everything from XP and XP 64-bit all the way to Windows 10 LTSB, and anecdotally it's actually snappier than the FX series on DDR3 if you are running Windows 7. I still use a few FX-8350s to this day as Linux machines. I don't know why it became trendy to shit all over the platform; the blowout prices were unbeatable. I believe it was Linus and his merry band of soyfags that started the trend.
 
The reason game devs dragged their feet on multithreading is that it is difficult to do well. However, clock speeds just aren't getting a lot faster. I got a 12 MHz CPU in 1990. Five years later, in 1995, I got a 100 MHz CPU, almost 10x faster. In 2001, six years later, I got a 1.2 GHz CPU, again more than 10x faster. That was 23 years ago. If we had kept that pace going, I should have at least a 1,000 GHz CPU in my computer by now. At this point, you have to be at least good enough with parallel code to hit 60 fps with an 8-core, 3.5 GHz CPU. That's a pretty easy target to hit, really.

Yes, but IPC is decoupled from frequency. While frequencies aren't improving much right now, actual performance still is. The FX-8350 was hitting 8.5 GHz under exotic cooling a decade ago, but that didn't mean it wasn't garbage.

Of course Moore's Law is still dead dead, but going by frequency alone isn't very useful.

Well, we've always had IPC increases. My 100 MHz Pentium could do more per cycle than my 12 MHz 286 could for sure - the 286 could not even process floating-point numbers in hardware! So in terms of real-world compute performance, it was well beyond 10x faster, depending on workload. By contrast, my i9-12900's single-core performance is only about 2x higher than my old i7-7700 on typical benchmarks.

Also, note that the theoretical limit is still basically clock speed times what the core can do per cycle. Newer CPUs are much better at staying close to that theoretical limit than older ones, hence the IPC uplift you tend to get from gen to gen.
This point is always interesting to me. Despite the speed increases (or lack thereof), the ability of the cores to do work has generally increased. I first remember hearing about floating-point numbers when reading about the OG Xbox. What are they, generally, and what makes them difficult for a core to process?
 
Have any of you had issues with POST on Nvidia using a DP cable? Such a bullshit issue. I took my PC apart a couple of times and thought my CPU got fried, then I plugged the monitor into the iGPU and it booted right up. First time I've had such a problem; who would have thought that a display cable could cause the BIOS to fail to POST?
I tested a couple of different cables and had the least trouble with HDMI, but my monitor only supports G-SYNC via DP, so that's not a solution.
What I get is 5 beeps (ASRock B650I Lightning with AMI UEFI) and then just the fans spinning up and down. Google suggests disabling the Full HD UEFI setting, and so far that has worked.
As a follow-up, I dropped the memory frequency to 6000 MT/s CL32 and switched to gear 1; it was running at 6400 MT/s CL32 in gear 2 before, but maybe that's too much for my 7800X3D (2x48 GB, so probably pushing it). I will try to tighten timings down the line, since it's a 6400 MT/s CL32 kit.
Maybe it's just the motherboard being retarded - why is it refusing to boot when it's headless? ASRock's retardation? I also flashed the latest beta BIOS. Too many variables at once, but something may work.
 
This point is always interesting to me. Despite the speed increases (or lack thereof), the ability of the cores to do work has generally increased. I first remember hearing about floating-point numbers when reading about the OG Xbox. What are they, generally, and what makes them difficult for a core to process?
You have integers, and then you have numbers with a decimal point - say a fixed-point number, where the decimal point is always in the same place. A floating-point number, at an extremely general level, is what it sounds like: the decimal point is not fixed, it floats around. That lets it express a huge range of both very small and very large numbers, but at the cost of accuracy and computation (it is much more complex to compute even addition when you have to account for where the point is). Without any hardware support, I recall some older CPUs like the Z80 taking around 100 times as many cycles for a floating-point operation as for an integer one.
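If you want to see the "floating" part directly, this is how a standard 32-bit IEEE 754 float splits into sign, exponent, and mantissa (plain Python, nothing specific to any CPU in this thread):

```python
# Unpack a 32-bit float into its sign bit, 8-bit exponent (bias 127), 23-bit mantissa.
import struct

def float_parts(x: float):
    bits = struct.unpack(">I", struct.pack(">f", x))[0]   # raw bit pattern
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127                # remove the bias
    mantissa = bits & 0x7FFFFF                            # fraction, implicit leading 1
    return sign, exponent, mantissa

for x in (1.0, 0.15625, -2048.0):
    print(x, float_parts(x))   # the exponent is what "floats" the point around
```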
 
You have integers, and then you have numbers with a decimal point - say a fixed-point number, where the decimal point is always in the same place. A floating-point number, at an extremely general level, is what it sounds like: the decimal point is not fixed, it floats around. That lets it express a huge range of both very small and very large numbers, but at the cost of accuracy and computation (it is much more complex to compute even addition when you have to account for where the point is). Without any hardware support, I recall some older CPUs like the Z80 taking around 100 times as many cycles for a floating-point operation as for an integer one.
I'm guessing this is good for stuff like 3D objects, considering all the weirdness that's going on?
 