GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

I’d still rather just save power by parking unused chiplets than by having some of the cores be less powerful. AMD demonstrated that this can be done, and that it works rather well, with the 7900X3D and the 7950X3D.

You can put more compute power on a die and deliver more compute per watt using small, slow cores than big, fast cores. This is the principle behind GPUs. Big, fast cores are better at executing arbitrary code. So IMO hybrid was the right way to go.
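
A back-of-envelope sketch of that principle, assuming dynamic power scales roughly with V²·f and using made-up but plausible voltages:

```c
/* Why small, slow cores win perf/watt: dynamic power ~ V^2 * f, and
 * lower clocks tolerate lower voltage. Voltages here are illustrative,
 * not measured figures for any real part. */
#include <stdio.h>

int main(void) {
    double one_big    = 1.20 * 1.20 * 4.0;       /* one 4.0 GHz core at 1.20 V */
    double four_small = 0.80 * 0.80 * 1.0 * 4.0; /* four 1.0 GHz cores at 0.80 V */
    /* Same 4 GHz of aggregate clock throughput either way... */
    printf("one big: %.2f  four small: %.2f (relative power)\n",
           one_big, four_small);                 /* ~5.76 vs ~2.56 */
    return 0;
}
```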

E-core design may be the future

E-core design is the present. It's the future for at least the next two generations (Lunar Lake and Arrow Lake both have E-cores, and so does Zen 5, at least on laptops). IDK where you're getting any information that it doesn't work. Works for me:

View attachment 5898400

Alright man, I relent. I was talking out of my ass and spun my personal taste as business analysis. They aren't making the desktop CPUs I want them to make.

I make the same mistake sometimes. Apparently, people are buying Sapphire Rapids. I think it's shit. But I guess it's just shit for what I want to do.
 
So after a long time of back and forth with my ISP about internet problems, I finally managed to get escalated to a technician who confirmed it was the router acting up, and that I was pushing it past what it was designed for. I purchased a TP-Link Archer AX80 (AX6000) router that I'm hoping will solve all my problems.

So far it has proven to be better: it has much faster Wi-Fi speeds and can even run three Wi-Fi networks at once (main, guest, smart devices), so I can pull out the second router I was using as an access point.
 
  • Like
Reactions: Man at Arms
E-core design is the present. It's the future for at least the next two generations (Lunar Lake and Arrow Lake both have E-cores, and so does Zen 5, at least on laptops). IDK where you're getting any information that it doesn't work. Works for me:

View attachment 5898400
It’s not that they don’t work, it’s that Incel and Microcock promised new optimizations in Windows 11 to make Thread Director create mind-blowing performance gains, but that didn’t materialize. The benchmarks show Windows 11 is maybe slightly better sometimes.
 
It’s not that they don’t work, it’s that Incel and Microcock promised new optimizations in Windows 11 to make Thread Director create mind-blowing performance gains, but that didn’t materialize. The benchmarks show Windows 11 is maybe slightly better sometimes.
Which benchmarks are these? Pure multithreaded compute, such as Cinebench, won’t be much different: each core does as much work as it can; the threads assigned to E-cores just do so markedly slower. What you should be looking at is performance and latencies (there's a big latency penalty when threads move to/from E-cores) for processes with few threads, which is where unoptimised schedulers will struggle. A scheduler unaware of E-cores may dump the benchmark thread on one and then just not move it to a P-core until it finishes, which will massively hurt performance. This is the issue Windows 10 had in games specifically, which is what the YouTube reviewers were complaining about, because Tomb Raider and GTA V are about all they use for benchmarking anyway.
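
If you want to take the scheduler out of the equation yourself, here's a minimal sketch in C (Linux) that pins the calling thread to P-cores only. It assumes the /sys/devices/cpu_core/cpus cpulist that recent kernels expose on hybrid Intel parts; anywhere else it just bails:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Pin the calling thread to P-cores so the scheduler can't strand it
 * on an E-core. Returns 0 on success, -1 if this isn't a hybrid CPU
 * (or the kernel doesn't expose the cpu_core sysfs node). */
static int pin_to_p_cores(void)
{
    FILE *f = fopen("/sys/devices/cpu_core/cpus", "r");
    if (!f)
        return -1;

    char buf[256];
    if (!fgets(buf, sizeof buf, f)) { fclose(f); return -1; }
    fclose(f);

    cpu_set_t set;
    CPU_ZERO(&set);

    /* Parse a cpulist like "0-15" or "0-7,16-23". */
    for (char *tok = strtok(buf, ","); tok; tok = strtok(NULL, ",")) {
        int lo, hi;
        if (sscanf(tok, "%d-%d", &lo, &hi) == 2) {
            for (int c = lo; c <= hi; c++)
                CPU_SET(c, &set);
        } else if (sscanf(tok, "%d", &lo) == 1) {
            CPU_SET(lo, &set);
        }
    }

    return sched_setaffinity(0, sizeof set, &set); /* 0 = calling thread */
}
```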
 
I still don't understand why hybrid. More cores = more performance for less power? Cool, can I buy one with just 500 E-cores for Monero mining? Why does my desktop want E-cores at all? Can't I just have a laptop CPU with no 'performance' cores for even more power savings? It doesn't seem compelling at all.
 
I see a bunch of tests designed to saturate the cores where, with two exceptions, Windows 11 either outperforms or ties with Windows 10. I don't see the catastrophic failure of Thread Director you're hyperventilating about.
Engineering and marketing cost money. They either didn’t need to develop it at all or, if they did, they could at least have saved on marketing, because no one is tripping over themselves to get E-cores or upgrade to Windows 11. Also, Assle silicon still kicks their ass, and there was no such thing ten years ago. Yes, I know Assle silicon has its own caveats, but they’re winning on appearance. And that goes back to my thesis. The marketing Incel is doing doesn’t seem to align with what’s actually going on with the chips. They’re still delivering a good product, but not for the reasons they say.
 
Also, Assle silicon still kicks their ass, and there was no such thing ten years ago. Yes, I know Assle silicon has its own caveats, but they’re winning on appearance.
No, they’re very much winning in practice, too. Apple Silicon dominates Qualcomm in both performance and power use on mobile, and while the laptops aren’t making huge performance gains any more, battery life is still three times better than AMD’s for about 25% worse performance (Intel still draws crazy levels of power under load, but they are better than AMD at idle, so it more or less evens out for regular use).
 
  • Informative
Reactions: George Lucas
The marketing Incel is doing doesn’t seem to align with what’s actually going on with the chips.

What's the difference between the marketing and the reality? Intel said Thread Director would provide hints to the operating system that would enable it to more efficiently schedule heterogeneous threads and avoid excessive context-switching. You haven't shown that it's failing to do this.

You showed some benchmarks where relatively homogeneous compute loads are saturating the cores, and Windows 11 tends to run up to ~7% faster in that case (although there's one case where it was 40% faster).

because no one is tripping over themselves to get E-cores or upgrade to Windows 11

View attachment 5899001

The architecture has been a big win for Intel. No wonder AMD is scrambling to get their answer out the door.
 
  • Thunk-Provoking
Reactions: Betonhaus
What's the difference between the marketing and the reality? Intel said Thread Director would provide hints to the operating system that would enable it to more efficiently schedule heterogeneous threads and avoid excessive context-switching. You haven't shown that it's failing to do this.

You showed some benchmarks where relatively homogeneous compute loads are saturating the cores, and Windows 11 tends to run up to ~7% faster in that case (although there's one case where it was 40% faster).



View attachment 5899001

The architecture has been a big win for Intel. No wonder AMD is scrambling to get their answer out the door.
Is revenue share calculated by profit margin? So if Intel sells 50 CPUs that they manufacture in their own foundries while AMD sells 100 that they paid TSMC to make, that could make the calculations difficult to compare.
 
I think the use case is an important distinction. With my laptop, I'm opening shit up: browsers, games, word processors, etc. Those different cores suddenly make sense.

A server is doing the same thing, over and over. Having the same core type in that instance makes sense, at least in my mind.
Maybe we'll see something like 8 P-cores + 256 E-cores eventually, so that a handful of single-threaded sensitive tasks can be run on the P-cores alongside the pile of E-cores for massively parallel tasks. I don't know if any servers need that, and no such product has been announced. In the meantime, enjoy Sierra Forest with 288 E-cores.

From the programming perspective, a single 5 GHz CPU with 10 MB of cache is going to beat five 1 GHz CPUs with 2 MB of cache each 100% of the time. However, it will also consume 5x the power and generate 5x the heat.
It could be worse than that, depending on the efficiency curves. That's why the 14900KS can show as little as +2% performance for +30% power draw (not always) from raising clock speeds just 2-3%.
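
Rough numbers on why that happens, assuming dynamic power ~ V²·f and an illustrative voltage bump (not measured 14900KS figures):

```c
/* A 3% clock bump that needs ~12% more voltage to stay stable costs
 * roughly 1.03 * 1.12^2 = ~29% more dynamic power. */
#include <stdio.h>

int main(void) {
    double f_ratio = 1.03, v_ratio = 1.12;
    printf("power ratio ~ %.2f\n", f_ratio * v_ratio * v_ratio); /* ~1.29 */
    return 0;
}
```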

Agreed. E-cores can screw off. Bonus points for marketing getting away with calling the processors "X amount of cores" by combining the E and P core counts.
I still don't understand why hybrid. More cores = more performance for less power? Cool, can I buy one with just 500 E-cores for Monero mining? Why does my desktop want E-cores at all? Can't I just have a laptop CPU with no 'performance' cores for even more power savings? It doesn't seem compelling at all.
AMD could come in with E-core chiplets and do it better. All they have to do is fit an 8-core Zen 5 chiplet and a 16-core Zen 5c chiplet onto AM5. AMD's "E-cores" support 2 threads per core, AVX-512, the same IPC, etc.

The 8 fast cores are good enough for games, and 16 additional cores maxing out at 3.5 GHz or whatever would deliver better multi-threading performance than another 8 cores with the same IPC at less than 5.7 GHz. Bonus points if they put 3D V-Cache on the gaming cores, avoiding the scheduling issues seen with the 7950X3D since the X3D cores would have priority from higher clocks.
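
Back-of-envelope for that claim, assuming equal IPC and perfect scaling (a crude throughput proxy, not a benchmark):

```c
/* Aggregate clock throughput: cores * clock, at equal IPC. */
#include <stdio.h>

int main(void) {
    printf("16 x 3.5 GHz = %.1f core-GHz\n", 16 * 3.5); /* 56.0 */
    printf(" 8 x 5.7 GHz = %.1f core-GHz\n",  8 * 5.7); /* 45.6 */
    return 0;
}
```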
 
What's the difference between the marketing and the reality? Intel said Thread Director would provide hints to the operating system that would enable it to more efficiently schedule heterogeneous threads and avoid excessive context-switching. You haven't shown that it's failing to do this.

You showed some benchmarks where relatively homogeneous compute loads are saturating the cores, and Windows 11 tends to run up to ~7% faster in that case (although there's one case where it was 40% faster).



View attachment 5899001

The architecture has been a big win for Intel. No wonder AMD is scrambling to get their answer out the door.
Thread Director is a big proprietary blob so I’ll consider it to be doing nothing until proven otherwise, and those small improvements we see do not prove anything. Meanwhile you ignore the benchmarks where Windows 10 still beats Windows 11, and those are mainly the CPU-bound ones!

It’s been a big win because it enables them to advertise that their processors have lots of cores. The performance comes from the fact that these are the hottest chips I think we’ve ever seen. It’s smoke and mirrors. The real story is Incel went for the hottest chips and that’s why they’re beating GayMD. That’s not bad per se, but it certainly isn’t slick, and is downright embarrassing when two ancient chip designers are getting their asses handed to them by a fruit company.
 
We should note that Apple's big.LITTLE layout started with the A10, released in 2016 with the iPhone 7/7 Plus.

So, while it wouldn't directly translate to the M series, Apple had at least a 4-year head start on getting program priority between bigger and smaller cores sorted.

And even Apple had its stumbles: the A10 could only use the big cores or the LITTLE ones, not both at the same time. That was fixed with the A11.
 
  • Like
Reactions: The Ghost of Kviv
Is revenue share calculated by profit margin? So if Intel sells 50 CPUs that they manufacture in their own foundries while AMD sells 100 that they paid TSMC to make, that could make the calculations difficult to compare.

Revenue by definition does not have costs taken out. Companies also typically don't publicly report profit at the division level.

Thread Director is a big proprietary blob so I’ll consider it to be doing nothing until proven otherwise

Well, your benchmarks showed up to 40% improvement.

Meanwhile you ignore the benchmarks where Windows 10 still beats Windows 11

I ignore things that look like they're within the margin of error. Here's a visualization of all the tests they did, making it pretty obvious that the overwhelming advantage is with Windows 11. It didn't look like they did any tests of heterogeneous, unsaturated workloads, which is what Thread Director is really for. I'm curious about the cause of that large gap in PCMark 10.

[attached chart: Windows 10 vs. Windows 11 results across all tests]


It’s been a big win because it enables them to advertise that their processors have lots of cores. The performance comes from the fact that these are the hottest chips I think we’ve ever seen.

I've done my own benchmarking of Alder Lake and can confirm that a quad of E-cores outperforms a P-core by about 1.4x for parallel workloads.

is downright embarrassing when two ancient chip designers are getting their asses handed to them by a fruit company.

Integrated LPDDR5 was a very smart choice by Apple. They gambled consumers wouldn't really care that they can't upgrade the memory any more, and they were right.
 
  • Like
Reactions: N Space
On the subject of e-cores (though I know it's probably played out at this point): there's a ton of little shit constantly running on most desktop PCs nowadays that doesn't need much performance to run. Shit like Discord or Telegram, or even OS processes like updaters or anti-malware scanners or the firewall. That shit doesn't need a full-fat modern CPU core; most of it could comfortably run on a Core 2 Duo.

You know what it does consume though? A context switch, and high-performance code *hates* context switches. So we park all the boring shit on e-cores and your vidya games get far more uninterrupted time on the good CPU cores.

Microsoft being retarded about OS design is obviously a problem, but that's not really a refutation of the model so much as an indictment of Windows. Linux has had this shit sorted since the mid-2010s.
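
And to be fair, Windows does expose the hint directly. Here's a minimal sketch (C, Win32) that opts the current process into power throttling (EcoQoS), which the Windows 11 scheduler takes as a cue to prefer E-cores; the API calls are real Win32 calls, error handling is minimal:

```c
#include <windows.h>

int main(void)
{
    /* EcoQoS: mark this process as efficiency-class background work. */
    PROCESS_POWER_THROTTLING_STATE state = {0};
    state.Version     = PROCESS_POWER_THROTTLING_CURRENT_VERSION;
    state.ControlMask = PROCESS_POWER_THROTTLING_EXECUTION_SPEED;
    state.StateMask   = PROCESS_POWER_THROTTLING_EXECUTION_SPEED; /* throttle on */

    if (!SetProcessInformation(GetCurrentProcess(), ProcessPowerThrottling,
                               &state, sizeof(state)))
        return 1;

    /* ...run the boring background work here... */
    return 0;
}
```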
 
Linux has had this shit sorted since the mid-2010s.

Linux was pretty bad at handling E-cores until a kernel update in I think 2022. The tricky problem is figuring out how much a process really needs to dispatch instructions at a certain rate. There are two technologies involved, Hardware Feedback Interface and Thread Director. Windows 10 supports HFI, while Windows 11 supports HFI + TD.

HFI tracks, as two 8-bit values, how much performance a core still has available and how energy-efficient it currently is relative to a max. If both are at zero, a core is completely tapped out. It's running as packed and hot as possible.

If I have P-Cores and E-Cores, a Performance Capability of 128 tells me each core is half as busy as it could be. It doesn't tell me how busy they are relative to each other.

Thread Director gives you a table of values for each core that tells you how much efficiency and performance it has available relative to the other classes. The table can potentially support up to 256 processor classes, but for now, there are just two. So this lets you know that even though an E-Core is half as busy as it could be, it still can't consume nearly as much work as a P-Core. Same with energy consumption.
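
To make that concrete, here's a toy model in C of the feedback described above. Illustrative only; this is not Intel's actual HFI/Thread Director memory layout:

```c
#include <stdint.h>

#define NUM_CLASSES 2 /* up to 256 possible; only two exist for now */

/* Per-core HFI-style feedback: two 8-bit values, 0 = tapped out. */
struct core_feedback {
    uint8_t perf_cap; /* remaining performance headroom, 0..255 */
    uint8_t ee_cap;   /* current energy efficiency vs. max, 0..255 */
};

/* Thread Director-style row: the same core reports different
 * capabilities per workload class, so a half-busy E-core can still
 * rank far below a half-busy P-core for class-0 work. */
struct core_class_table {
    struct core_feedback by_class[NUM_CLASSES];
};
```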
 
Qualcomm boasts about upcoming chip, says most PC games ‘should already work’
Qualcomm has been making big claims about its Snapdragon X Elite chip. The company previously said the chip outperforms Apple’s M3 chip and now the company says most Windows games should “just work” on the chip.

According to The Verge, a Qualcomm presentation at the 2024 Game Developers Conference (GDC) detailed how Windows laptops sporting the company’s X Elite chip could run games at nearly full speed via emulation.


Qualcomm engineer Issam Khalil explained during the talk that developers have three options when it comes to games on the X Elite.

First, they can port titles to native ‘ARM64’ to get the best CPU performance and power usage. That’s because Qualcomm’s scheduler can dynamically lower CPU frequency.

The second option is a hybrid where developers can create an ‘ARM64EC’ app where Windows, its libraries, and Qualcomm’s drivers run natively, but the rest of the app is emulated. Khalil says this gives “near-native” performance.

And the final option is to do basically nothing and games should still work using x64 emulation.

Khalil reportedly said that developers shouldn’t need to change game code or assets to get full speed, especially since most games are graphically bottlenecked by the GPU, not the CPU, and GPU performance isn’t affected. Admittedly, there is a slight CPU performance hit when translating or transitioning between x64 and ARM64, but this reportedly only happens the first time a chunk of code is translated.
And on the GPU side, Qualcomm has Adreno GPU drivers for DX11, DX12, Vulkan and OpenCL, with DX9 and up to OpenGL 4.6 via ‘mapping layers.’

Naturally, not everything will work smoothly. Slides from the GDC presentation detailed limitations, such as games using kernel-level anti-cheat drivers. Games using AVX instruction sets also won’t work, though Khalil says developers should use SIMDe to get a head start converting those to NEON code.

It’s also worth noting that Khalil didn’t say how many games work or how many Qualcomm tested. However, he did say the company was testing top games on Steam and in doing so, it’s confident most other titles should work too.

The Verge also pointed out that we don’t know how good Snapdragon X Elite is at gaming anyway, and Qualcomm told the publication it’s seen ARM run a game faster than x86, or get better battery life than x86, but not both.


Still, if Qualcomm can pull off a smooth transition for games from the current x86 standard to ARM, that could go a long way in making the Snapdragon X Elite a more viable option for many people. ARM-based Windows devices have struggled for a while for various reasons, like mediocre performance and lack of software support. But things have improved, and if Snapdragon X Elite is as good as Qualcomm says, there might finally be some viable ARM Windows machines to give Apple’s M-series MacBooks a run for their money.

It likely won’t be much longer before we get to see for ourselves how good the X Elite is. Qualcomm says systems with X Elite are coming this summer, and Microsoft’s rumoured upcoming Surface Pro 10 and Laptop 6 (for consumers, not the recently-unveiled business variants) are set to use the X Elite and are expected to launch in May.
If developers are forced to rework their anticheat engines for this, it may lead to better support on Linux for games with anticheat
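
Side note on the SIMDe route the talk mentions: the idea is you keep your AVX code and recompile it against SIMDe's headers, which map the x86 intrinsics onto NEON on ARM. A minimal sketch; the simde_* calls are the library's real aliases, while the function itself is made up for illustration:

```c
/* Compiles natively on x86 (AVX) and on ARM (lowered to NEON) via
 * SIMDe (github.com/simd-everywhere/simde). */
#include <simde/x86/avx.h>

void add8_floats(const float *a, const float *b, float *out)
{
    simde__m256 va = simde_mm256_loadu_ps(a);
    simde__m256 vb = simde_mm256_loadu_ps(b);
    simde_mm256_storeu_ps(out, simde_mm256_add_ps(va, vb));
}
```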
 
  • Optimistic
Reactions: Vecr and DavidS877
If developers are forced to rework their anticheat engines for this, it may lead to better support on Linux for games with anticheat
Now I want to see if Oryon-era Qualcomm Windows on ARM ships more units than the Steam Deck, because that was already a thing that spurred *some* anti-cheat support.

Never trust Qualcomm; buy a Snapdragon X Elite only when it goes on sale, if at all.
 