GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

The server version is EPYC, Threadripper is just scaled down EPYC for workstations/gaming PCs for idiots.
I stand corrected!
The reason is simple: AMD are much smaller than Intel. Intel can afford to keep separate server/PC lines, while AMD can only really afford to do one or the other. So AMD made a terrific server chip and implemented it as chiplets. You can get a highly efficient server by putting half a dozen chiplets on one processor, or you can get a quite performant and still fairly efficient PC by putting only one or two chiplets on the processor, and pumping a bit more power in instead so they can reach higher clocks.
Yeah… A lot of people don’t realize how small AMD really is. It’s a tiny company compared to some of the others, yet still somehow manages to compete with Nvidia and Intel.

Great example of “work smart, not hard”.

They get to have their GPU development subsidized by Sony and Microsoft, and on the CPU front they make a core not just powerful enough to take on Intel in servers, but also efficient enough to trade blows with Qualcomm's latest power-sipping cores.

Just a shame that Intel still does their subsidy bullshit. That's the only explanation I can see as to why something like 70% of high-end laptops still use Intel power hogs.
 
In an alternative universe where Intel didn't fuck up its own process node roadmap due to chasing quarterly profit & diversity awards, AMD would still be winning in the server space simply because of cost. Zen 1 already exceeded Skylake in core count despite being on a less dense node. Rome was supposed to launch against Sapphire Rapids, and Intel would have been positioning a $10K 56-core CPU against a $7,500 64-core CPU. The inroads they've made in desktop & notebook have been gravy, and good on them for taking advantage of every opportunity they got.
 
I’m just hoping desktop zen5X3D will have the extra cache on both CCDs. I really really want extra cache, but not if only half the cores get it.
My understanding is that cores having to access the extra cache across Infinity Fabric run into enough latency that performance ends up basically the same as a vanilla CCD, just with the lower clock speed that comes along with the stacked cache.
 
I’m just hoping desktop zen5X3D will have the extra cache on both CCDs. I really really want extra cache, but not if only half the cores get it.
We already got a leak saying that won't happen. Same configs as last time.

The minimum expected improvement for 9000X3D is that it will supposedly have full overclocking support.

AMD Ryzen 9000X3D series rumored to fully support overclocking, Zen5 gets improved DDR5 memory support
AMD Ryzen 9000X3D “Zen 5” CPUs To Feature Same 3D V-Cache As Ryzen 7000X3D: 9950X3D & 9900X3D With 128 MB, 9800X3D With 96 MB L3
AMD's Ryzen 9000 won't beat the previous-gen X3D models in gaming, but they'll be close — improved 3D V-Cache coming, too
"And then when it comes to X3D, and I'll just get around that now, we're super committed to X3D. In fact, we have some really, really cool updates to X3D coming. So we're working on iterating and not just rehashing it," said Woligroski.
 
For Ryzen, this is probably, sadly, correct. However, AMD recently launched EPYC on AM5! These chips are a lot more likely to get workstation features like that. Honestly, for gaming the 5800X3D is still plenty, and will most likely be for many years; the 7800X3D is a generational improvement on that, as the 9800X3D will be. There's just no call for more cores on a gaming platform yet, and that's what these processors are targeting. But workstations are different. Some tasks are very memory-hungry, and X3D shines in essentially all of them. So the 7900X3D and 7950X3D basically don't make sense: you either run all the cores and accept that half of them only see the base amount of cache, or run only the cached CCD's few cores with lots of cache. Good enough for games, sure, but people building or buying workstations are willing to pay more, which is what the AM5 EPYC line is all about.
My understanding is that cores having to access the extra cache across Infinity Fabric run into enough latency that performance ends up basically the same as a vanilla CCD, just with the lower clock speed that comes along with the stacked cache.
If they can make over a gigabyte (!) of L3 cache work on the 96-core EPYCs, I don't see why they couldn't apply that same technique on AM5. Certainly those processors aren't suffering from cache latency problems. You can set your processor topology such that the L3 cache is contained to a certain group of processors. If you run lscpu -e you can see this for yourself: on an AM5 chip with two CCDs, each core has its own L1 and L2 caches, plus an L3 cache shared between all the cores on that CCD. It's not even a matter of adding more L3 groups like on EPYC; I think this is something you'd set with the NUMA nodes. The scheduler should be able to handle not moving threads across CCDs on its own; IIRC that has been the design since the 3000 series.
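If anyone wants to see that grouping without eyeballing the lscpu table, here's a minimal sketch that reads it straight from sysfs. It assumes a Linux box with the standard /sys/devices/system/cpu cache entries; all it does is print which logical CPUs share each L3, which is exactly the per-CCD split described above:

```python
# Minimal sketch: group logical CPUs by the L3 cache they share, read
# straight from Linux sysfs. On a two-CCD AM5 part this prints two groups.
import glob
import os

l3_groups = {}
for cpu_dir in glob.glob("/sys/devices/system/cpu/cpu[0-9]*"):
    for index_dir in glob.glob(os.path.join(cpu_dir, "cache", "index*")):
        with open(os.path.join(index_dir, "level")) as f:
            if f.read().strip() != "3":
                continue  # only interested in the L3 entries
        with open(os.path.join(index_dir, "shared_cpu_list")) as f:
            # e.g. "0-7,16-23"; exact form depends on how SMT siblings are enumerated
            shared = f.read().strip()
        l3_groups.setdefault(shared, set()).add(os.path.basename(cpu_dir))

for shared, cpus in sorted(l3_groups.items()):
    print(f"L3 shared by logical CPUs {shared} ({len(cpus)} entries)")
```

Each printed group corresponds to one CCD; on the current X3D parts only one of those groups sits in front of the stacked cache.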
 
If they can make over a gigabyte (!) of L3 cache work on the 96-core EPYCs, I don't see why they couldn't apply that same technique on AM5.

I'm not sure what you're trying to say. L3 cache across CCDs is visible on a 9004X CPU, but at higher latency. However, the primary use case of the 9004X series is HPC applications where each core has a different process pinned to it with its own memory space, so there is no real need to access L3 cache from another CCD. Moreover, the bandwidth you get is still a lot higher than going to main RAM. The whole reason AMD went with more L3 cache rather than more L2 cache, which is the route Intel had been going, was that they were targeting applications where bandwidth, not latency, is the primary concern.
 
I'm not sure what you're trying to say. L3 cache across CCDs is visible on a 9004X CPU, but at higher latency. However, the primary use case of the 9004X series is HPC applications where each core has a different process pinned to it with its own memory space, so there is no real need to access L3 cache from another CCD. Moreover, the bandwidth you get is still a lot higher than going to main RAM. The whole reason AMD went with more L3 cache rather than more L2 cache, which is the route Intel had been going, was that they were targeting applications where bandwidth, not latency, is the primary concern.
I meant that it's mostly a matter of letting the scheduler know the architecture, so it won't move threads between CCDs unnecessarily. There's no good reason they couldn't put the extra cache on both CCDs other than cost savings and market segmentation. Even if some inter-die communication is necessary, that's still going to be more performant than fetching from RAM, let alone outright turning off half the cores.

Unfortunately I don’t have a 9004X to test it with, but I would have assumed lscpu would list the caches as belonging to a single die.

Bandwidth vs. latency isn't really what I was getting at; L3 outperforms RAM in both. By a lot, even.
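And for anything latency-sensitive you don't even need NUMA configuration to get that behaviour; a plain affinity mask keeps a process on one CCD. A rough sketch, assuming Linux and assuming (purely for illustration) that logical CPUs 0-15 are the cache CCD; the real mapping should come from lscpu -e or the sysfs snippet above, since enumeration varies:

```python
# Rough sketch: confine the current process to one CCD so its threads never
# reach across Infinity Fabric for L3 hits. Linux-only.
# The CPU set below is an illustrative guess, not a universal mapping;
# read the real topology from lscpu/sysfs on the actual machine.
import os

CACHE_CCD_CPUS = set(range(0, 16))  # hypothetical: logical CPUs 0-15 = V-Cache CCD

os.sched_setaffinity(0, CACHE_CCD_CPUS)  # 0 means "this process"
print("pinned to:", sorted(os.sched_getaffinity(0)))
```

Which is roughly what the chipset driver plus Game Bar arrangement tries to do automatically for games on the 7950X3D, minus the guesswork about which processes deserve the cached CCD.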
 
For Ryzen, this is probably, sadly, correct. However, AMD recently launched EPYC on AM5! These chips are a lot more likely to get workstation features like that.
I thought that might happen too when I heard about it, but the entire EPYC 4004 series on AM5 turned out to be carbon copies of existing Ryzen 7000 desktop CPUs, with the same 1x cache die on the two 3D V-Cache models. It's trivial to source those compared to the slightly greater effort of making a new AM5 chip with 2x cache dies. The carbon-copy exception may be the quad-core 4124P, which could be using an APU die since it has half the L3 cache.

AMD will not be too generous to anyone too cheap to shell out for Threadripper/SPn socket Epyc. They did what they needed to do to embarrass the Intel Xeon E-2400 and called it a day.

With perfect scheduling and a mix of workloads, the 7950X3D/9950X3D is perfect... for a handful of people. I think an 8+16c with 1x 3D cache would make more sense.
 
I meant that it's mostly a matter of letting the scheduler know the architecture, so it won't move threads between CCDs unnecessarily. There's no good reason they couldn't put the extra cache on both CCDs other than cost savings and market segmentation. Even if some inter-die communication is necessary, that's still going to be more performant than fetching from RAM, let alone outright turning off half the cores.
I doubt there are a lot of applications where anyone cares to optimize at that level for a high-end consumer CPU. For the customers that are legitimately that demanding of performance, EPYC or Threadripper with up to 12 channels of RAM are the way to go.
 
I've complained before that the 12 GB of VRAM on the Radeon 6700 XT never seems to be useful. A recent example is No Man's Sky. With everything except textures on Ultra, it stays under 8 GB and runs at 90 fps. Bump the textures up to Ultra, and now it's consuming 11.5 GB and running at 30-40 fps inside the hangar of my capital ship.
This doesn't occur on the 6750XT... right?

By the way, HUB's review of the 9600X has been posted and this comment really sums up the current CPU mindset:
This is all Intel's fault for pushing the narrative that nothing matters except pushing chips as hard as possible and using the most wattage if it means being able to post gains.

AMD comes along, makes more power-efficient chips that match or slightly exceed the previous generation's performance, then gets backlash for it. Come on.

I can't believe Linus Tech Tips is somehow the only publication that got it right in this round of Zen 5 reviews.
Intel and its consequences have been a disaster for CPU benchmarking.
 
I doubt there are a lot of applications where anyone cares to optimize at that level for a high-end consumer CPU. For the customers that are legitimately that demanding of performance, EPYC or Threadripper with up to 12 channels of RAM are the way to go.
I knoooow, but I just really want lots of cores with tonnes of cache in my watercooled mITX workstation…
 
Intel and its consequences have been a disaster for CPU benchmarking.
It's not an "Intel narrative." Standalone CPUs are purchased heavily by gamers, and the gaming review sites sing a CPU's praises to the moon if it's able to get sub-refresh-rate FPS in 10-year-old games, since new games are virtually never CPU-bound. Not too long ago, there were videos with rageface thumbnails because Rainbow Six Siege couldn't achieve 900 fps on 14th gen Intel without thread pinning or some pointless shit like that. The market wants to see GN and LTT soyface with BEST GAMING CPU EVER headlines, which causes downmarket effects because seeing the latest Ryzen 9/i9 with the soyface approval thumbnail makes you feel a lot better about the i7/i5/Ryzen 7/Ryzen 5 you end up actually buying.
 
I'd say part of what hurt AMD here is going too low on power, so all the techtubers are sobbing about no performance gain.

If it was, I dunno, 80 watts instead of 65, it would still use less power than the previous generation and also show a performance boost.

Or... the techtubers could do a proper performance-per-watt comparison. But that requires integrity. And intellect.
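The math for that chart is dead simple, too. A throwaway sketch with made-up placeholder numbers (not benchmark results), just to show the shape of it: log average FPS and average package power per chip, divide, and sort:

```python
# Throwaway sketch: turn (average FPS, average package power) pairs into a
# performance-per-watt ranking. The numbers are placeholders to show the
# arithmetic, not real benchmark data.
results = {
    "CPU A, 65 W class":  {"avg_fps": 180.0, "avg_watts": 78.0},
    "CPU B, 105 W class": {"avg_fps": 195.0, "avg_watts": 135.0},
}

ranked = sorted(results.items(),
                key=lambda kv: kv[1]["avg_fps"] / kv[1]["avg_watts"],
                reverse=True)

for name, r in ranked:
    fps_per_watt = r["avg_fps"] / r["avg_watts"]
    print(f"{name}: {r['avg_fps']:.0f} fps at {r['avg_watts']:.0f} W "
          f"-> {fps_per_watt:.2f} fps/W")
```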
 
I'd say part of what hurt AMD here is going too low on power, so all the techtubers are sobbing about no performance gain.

If it was, I dunno, 80 watts instead of 65, it would still use less power than the previous generation and also show a performance boost.

Or... the techtubers could do a proper performance-per-watt comparison. But that requires integrity. And intellect.
I mostly agree, although:
A. They kind of have to own the defaults; most customers aren't even going to try touching that.
B. Some places like GN claim to be re-testing the other CPUs on the chart because games get updated over time.
 
I understand the techtubers' view, because most people don't care about this level of power consumption in the 65 W vs. 105 W TDP range.

AMD should scrap their TDP method anyway since it's so loosely rated that it's pointless, especially since most models don't come with coolers anymore.
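Part of why it's so loose: on AM4/AM5 the advertised TDP isn't even the socket power limit; the number that actually matters is PPT, which at stock works out to roughly 1.35× the TDP figure (hence the familiar 65 W → 88 W and 105 W → 142 W pairs). A quick sketch of that commonly cited relationship:

```python
# Quick sketch: the commonly cited stock relationship on AM4/AM5 between the
# advertised TDP and the actual package power limit (PPT ~= 1.35 x TDP).
# The 1.35 factor is the widely reported default, not something measured here.
PPT_FACTOR = 1.35

for tdp_w in (65, 105, 120, 170):
    print(f"TDP {tdp_w:>3} W -> stock PPT ~{round(tdp_w * PPT_FACTOR)} W")
```

So a "65 W" 9600X is really being judged against an ~88 W package budget, which makes the box number even less meaningful when there's no cooler in the box anyway.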
 
Yeah.
I can't help but compare it to Apple, but that was vastly different.

1. The M1 was replacing lower end Intel CPUs in laptops and the Mac Mini, which would benefit from a lower TDP. (Throttling, etc)
2. The M1 didn't just match the Intel CPUs, but annihilated them, while using dramatically lower power.

With AMD, it's competing both against itself and against Intel (to a lesser extent), and it did so with a pretty powerful processor that's going into reasonably large desktops, where cooling and power draw aren't a worry.

I'd be curious to see how AMD's new offering does in an SFF system, something where the 7950X3D might throttle a bit and lose performance. But that's getting nitpicky.

For AMD, I'd say it's kind of a swing and a miss. The techtubers might have responded more kindly to it if the benchmarks had shown a performance boost.
Granted, compared to Intel, AMD's got the advantage of 'People aren't actively going out to tell you DON'T BUY IT.'
 
'People aren't actively going out to tell you DON'T BUY IT.'
You'll still get the occasional autist who screams at you to not buy CPU A because CPU B is 0.00001% better in some metric, or 50% better in a metric that is meaningless to you.
 