GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

President Trump Wanted to Break NVIDIA — Until He Realized Jensen Was The “AI Warlord” and Said It’d Take a Decade to Beat Them, Even With The Greatest Minds Together
[Attached photo: Jensen Huang, in a suit, standing next to Trump]
I don't know what's more bizarre about this photo: seeing Jensen in a suit and not in a leather jacket, or Jensen with Trump doing his classic dopey "stand and look" stance next to him.
 
Correct me if I'm wrong but AMD built their entire architecture around 3D V-Cache. They've spent the entire lifespan of Ryzen aiming to produce the design that ended up being the 9800X3D.
I don't think it was like their guiding star, but an option that eventually became available because it was new on TSMC's menu of packaging services. I don't think they had it in mind when designing Zen 1.

3D V-Cache was optional and modular. Zen 2 had the TSV connector pads needed to attach a cache chiplet on top of the CCD, but they didn't go through with it. TSMC's packaging technology may have been low yielding at the time. Zen 3 finally realized the vision with the 5800X3D, a while after the initial chips came out. Zen 4 introduced dual-CCD X3D chips with only one CCD having the extra cache, and various excuses were given. However, Epyc chips with all the CCDs having 3D V-Cache have been produced.


Zen 5 fundamentally changed how it worked, putting the cache below the chiplet, which has several downsides, all apparently outweighed by the improved thermal situation and higher clocks.

Every CCD from Zen 2 to Zen 5 has had 32 MiB of L3 cache, and since Zen 3 it has been unified. Zen 3/4/5 X3D CCDs add a 64 MiB cache chiplet to reach 96 MiB.

With Zen 6, the CCD gets 48 MiB unified L3 cache. The cache chiplet should be 96 MiB, for a total of 144 MiB. Otherwise it should look similar to Zen 5, except that the I/O chiplet and CCD will now be physically touching and connected by bridge dies for lower latency. This is probably going to change the thermal situation yet again.
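Summing that up in a quick sketch (the Zen 6 figures are the rumored ones from above, not confirmed):

```python
# L3 per CCD: on-die cache + stacked V-Cache die, in MiB.
x3d_l3 = {
    "Zen 3/4/5 X3D": (32, 64),        # shipping parts
    "Zen 6 X3D (rumored)": (48, 96),  # figures from the rumor above
}
for gen, (base, stacked) in x3d_l3.items():
    print(f"{gen}: {base} + {stacked} = {base + stacked} MiB")
# Zen 3/4/5 X3D: 32 + 64 = 96 MiB
# Zen 6 X3D (rumored): 48 + 96 = 144 MiB
```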

Intel has various packaging technologies like "Foveros", and has stacked chiplets before, such as in that crappy "Lakefield" product. They've moved to chiplets in desktop and mobile. I don't think they are going to have much trouble producing an X3D competitor. And we saw from the 5800X3D on that many games could automatically take advantage of the tripled cache.
 
Intel has various packaging technologies like "Foveros", and has stacked chiplets before, such as in that crappy "Lakefield" product. They've moved to chiplets in desktop and mobile. I don't think they are going to have much trouble producing an X3D competitor.
The real question is going to be whether extra V-Cache on Intel delivers the same massive performance gains that it does for AMD.
 
The real question is going to be whether extra V-Cache on Intel delivers the same massive performance gains that it does for AMD.

Unlike AMD, which has shipped the same cache per CCD across most 6/8/12/16-core products, Intel has for many generations stepped up the amount of L3 cache gradually as you go up the lineup. And you can see minor performance gains from the cache alone.

I think it's going to end up depending on the game, and an Intel 3D cache part will act mostly the same as AMD's. But there are other considerations, like the added latency and the bandwidth of the cache chiplet.
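For a rough sense of that cache stepping, here are 12th-gen L3 sizes as I remember them (approximate, from memory — check Intel ARK before quoting these):

```python
# Approximate L3 sizes for Alder Lake SKUs (from memory; verify on ark.intel.com).
l3_mib = {
    "i3-12100": 12,
    "i5-12400": 18,
    "i5-12600K": 20,
    "i7-12700K": 25,
    "i9-12900K": 30,
}
for sku, mib in l3_mib.items():
    print(f"{sku}: {mib} MiB L3 ({mib / l3_mib['i3-12100']:.1f}x the i3)")
```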

Going to watch this now:

  • "Magnus" is 192-bit, not 384-bit, using GDDR7. 68 ~RDNA5/UDNA1 compute units, disabled from 70. Could be the next Xbox instead of PS6.
  • "Magnus" apparently shares the graphics chiplet with RDNA5/UDNA1 desktop GPUs.
  • Flagship RDNA5 gaming GPU listed has: 154 CUs, 36 GB of 36 Gbps GDDR7 (384-bit).
  • 192-bit with 18 GB.
  • 160-bit with 15 GB.
  • 128-bit with 12 GB. No more 8 GB.
  • 36 Gbps memory used for all four could be a placeholder. Expect slower.
  • They may be disaggregating the PCIe controller and some other functions onto a skinny chiplet so that a GCD can use PCIe 6 for enterprise models or cheaper PCIe 5 for consumer.
[Attached image: the leaked RDNA5 lineup table]
 
Correct me if I'm wrong but AMD built their entire architecture around 3D V-Cache. They've spent the entire lifespan of Ryzen aiming to produce the design that ended up being the 9800X3D.
You're wrong. The idea for 3D V-Cache came up at some point in the development phase of Zen 3 EPYC, largely in response to the poor showing of 64-core Zen 2 EPYCs in compute-dominated markets, like HPC. It's absolutely dominant there, and AMD has basically cleaned Intel out of that market segment almost entirely. In the desktop space, its primary function is to boost the Ryzen brand as a whole. Leading the charts with your enthusiast product has significant effects on brand perception and the success of downmarket products. Few gamers are actually running games at 1080p on high-end cards in order to achieve CPU bottlenecking.
 
It won't. Look, I'm not gonna hide my main motivator behind getting a GPU with 24GB of VRAM, so here's the quick rundown.

For a base model to be usable for NSFW, it first needs to be on an open license. The last Stability AI model on such a license was SDXL; with SD3 they introduced a restrictive license, so no one trains merges on it. For Flux.1, only the Schnell model is on an open license, and right now the only NSFW merge being trained on that is Chroma.

However, most sloppers are still using SDXL merges. One, they're lighter; two, they're better. The previous meta was Pony Diffusion, but that had a major issue where you had to use LoRAs for everything, as the base model only had one shitty artstyle and couldn't handle characters too well. The current meta, however, is Noob/Illustrious-based models, and those handle just about every artstyle; all you need to do is prompt with e621/Danbooru tags to get what you want. No LoRAs needed for the most part. It's also insanely good at hands and multi-character scenes with just direct prompting. In comparison, Chroma doesn't handle artstyles at all, Lodestones doesn't plan on adding them to the dataset, and Chroma's dataset is still smaller than Noob's. Chroma also doesn't handle e6/Dan tags as well.

In short, it won't, because SD3 and Stability AI suck dick.
I really shouldn't encourage this kind of thing, but I'd love to see your impressions of a second NVLinked 3090, especially since SLI and consumer NVLink are long gone. Even if the most "productive" thing you'd probably do is videogames (with likely awful price-to-performance or performance-per-watt yields).

 
Flagship RDNA5 gaming GPU listed has: 154 CUs, 36 GB of 36 Gbps GDDR7 (384-bit).
192-bit with 18 GB.
160-bit with 15 GB.
128-bit with 12 GB. No more 8 GB.
why in the fuck are they so stingy with the bus width now? are they using that "running out of silicon" excuse or is this just plain old enshittification? 6 years ago they did 256-bit cards but now the best you get is 192-bit if you aren't gunning for the pro versions? fucking gay.
Never let chinks or poos take over white companies. These people are categorically incapable of being anything other than good goy stewards who can fellate shareholders extra good.
it's not a big problem per se; what it seems like is that they're in a cartel-style fake competition now that the costs of competing with them have reached crazy highs. nothing new.
 
why in the fuck are they so stingy with the bus width now? are they using that "running out of silicon" excuse or is this just plain old enshittification? 6 years ago they did 256-bit cards but now the best you get is 192-bit if you aren't gunning for the pro versions? fucking gay.
More bus width means more cost and power consumption. Big L2/L3 caches from Nvidia/AMD have mitigated the need for it somewhat. With fast GDDR7 and 3 GB modules, you have more bandwidth and capacity for the same bus width.

Nvidia is only using 30 Gbps GDDR7 so far (5080 only); 36 Gbps would be a 20% jump, or a massive +79% from the 20.1 Gbps memory in RDNA4 cards. Since Blackwell seems to have more bandwidth than it needs, conservative bus widths for RDNA5 could be fine.

But don't be surprised if many details turn out wrong, because that is a bizarre-looking lineup. Though I think a 6090 competitor and a huge gap is perfectly plausible if it's going to share silicon with some professional/AI card. 12 GB at 128-bit is also the correct move for the low end. The TDP placeholders seem high, so I wouldn't be shocked to see a weaker 8 GB card using 2 GB modules below that.
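If you want to sanity-check those numbers, here's the bandwidth math in a quick sketch (peak bandwidth = bus width / 8 × data rate; the 36 Gbps figure is the leak's placeholder, not confirmed):

```python
# Peak memory bandwidth in GB/s = bus width (bits) / 8 * data rate (Gbps).
def bandwidth(bus_bits: int, gbps: float) -> float:
    return bus_bits / 8 * gbps

# The percentage jumps quoted above:
print(f"30 -> 36 Gbps: +{36 / 30 - 1:.0%}")      # +20%
print(f"20.1 -> 36 Gbps: +{36 / 20.1 - 1:.0%}")  # +79%

# Rumored RDNA5 bus widths at the leak's placeholder 36 Gbps:
for bus in (384, 192, 160, 128):
    print(f"{bus}-bit: {bandwidth(bus, 36):.0f} GB/s")
# 384-bit: 1728, 192-bit: 864, 160-bit: 720, 128-bit: 576
```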
 
More bus width means more cost and power consumption. Big L2/L3 caches from Nvidia/AMD have mitigated the need for it somewhat. With fast GDDR7 and 3 GB modules, you have more bandwidth and capacity for the same bus width.

Nvidia is only using 30 Gbps GDDR7 so far (5080 only); 36 Gbps would be a 20% jump, or a massive +79% from the 20.1 Gbps memory in RDNA4 cards. Since Blackwell seems to have more bandwidth than it needs, conservative bus widths for RDNA5 could be fine.

But don't be surprised if many details turn out wrong, because that is a bizarre-looking lineup. Though I think a 6090 competitor and a huge gap is perfectly plausible if it's going to share silicon with some professional/AI card. 12 GB at 128-bit is also the correct move for the low end. The TDP placeholders seem high, so I wouldn't be shocked to see a weaker 8 GB card using 2 GB modules below that.
Regarding 12 GB cards in the future: once the new gen consoles come out, doesn't 12 GB just immediately become the new 8 GB? The next gen consoles are rumored to have 24-32 GB of system memory, so 16-20 GB seems like it'll become the new standard.
 
Regarding 12 GB cards in the future: once the new gen consoles come out, doesn't 12 GB just immediately become the new 8 GB? The next gen consoles are rumored to have 24-32 GB of system memory, so 16-20 GB seems like it'll become the new standard.
The effects of a new console launch certainly aren't immediate, and you could already regard 12 GB as the amount for 1080p *only* right now, with 16 GB for anything else. The PS6 (the console that matters) may be coming out later than the next Xbox.

I would like to see 24 GB on 128-bit though. Absurd, but it's the same number of modules as today's 128-bit 16 GB cards (eight in clamshell), just 3 GB each instead of 2 GB. Depends on the $/GB of 3 GB GDDR7 modules.
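The module math behind that, as a sketch (GDDR7 devices are 32 bits wide; clamshell mounts two per channel, which is how today's 128-bit 16 GB cards are built):

```python
# GDDR7 modules are 32 bits wide; clamshell mounts two per 32-bit channel.
def capacity_gb(bus_bits: int, module_gb: int, clamshell: bool = False) -> int:
    modules = (bus_bits // 32) * (2 if clamshell else 1)
    return modules * module_gb

print(capacity_gb(128, 2, clamshell=True))  # 16 GB -- today's 128-bit cards
print(capacity_gb(128, 3, clamshell=True))  # 24 GB -- same 8 modules, 3 GB each
print(capacity_gb(128, 3))                  # 12 GB -- the rumored config, 4 modules
```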
 
I really shouldn't encourage this kind of thing, but I'd love to see your impressions of a second NVLinked 3090, especially since SLI and consumer NVLink are long gone. Even if the most "productive" thing you'd probably do is videogames (with likely awful price-to-performance or performance-per-watt yields).

Yeah, that would be a big investment, and it would only really see use for local video generation, since that's still very VRAM- and compute-heavy. Otherwise a single 3090 handles image generation just fine, and it would be wiser for me to save up for an AM5 platform upgrade since the i5-12400 bottlenecks the 3090 in games that actually use the CPU well, like Witcher 3. Otherwise all the old shit I play that is tied to a single core will struggle either way. Also, the amount of dumbfuckery I had to do to improve the thermals on my 3090 in my current case means two 3090s would be rather thermally challenged.
 
why in the fuck are they so stingy with the bus width now?
What matters is total bandwidth. If you can get enough bandwidth with a narrower bus and faster memory, that frees up silicon on the die for other stuff.
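As a concrete (made-up) comparison: halve the bus, double the data rate, and total bandwidth is unchanged:

```python
# bus (bits) / 8 * rate (Gbps) = GB/s
print(256 / 8 * 18)  # 576.0 -- 256-bit GDDR6 at 18 Gbps
print(128 / 8 * 36)  # 576.0 -- 128-bit GDDR7 at 36 Gbps, same bandwidth
```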

Regarding the 12gb cards in the future. Once the new gen consoles come out, doesn’t 12gb just immediately become the new 8gb? The next gen consoles are rumored to have 24-32gb of system memory, so 16-20gb seems like it’ll become the new standard.
Since we're in the era of diminishing qualitative returns, each console gen holds on longer than the last. It is only now, in 2025, that the 11-year-old PS4 is no longer the base platform target of every AAA game. Just about any game that has a PS4 version runs fine on a GTX 970 (I've checked a bunch). If trends hold, the PS5 will be the lead platform of nearly all games until 2031 at the earliest, and probably longer than that.

since the i5-12400 bottlenecks the 3090 in games that actually use the CPU well, like Witcher 3
Based on what I can find, an i5-12400 bottlenecks around 110 fps in Witcher 3. A 3090 bottlenecks at 75 fps on 4K Ultra native, no RTX. So the CPU isn't really the limiting factor unless you're trying to get 120 fps+ by using lower settings. But there are definitely some newer games that need a better CPU. Darktide hits my CPU pretty hard.
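The bottleneck logic in one line, using the figures above (a toy model, obviously; real frame pacing is messier):

```python
# Frame rate is capped by whichever side is slower.
cpu_cap = 110  # i5-12400 ceiling in Witcher 3 (per the numbers above)
gpu_cap = 75   # 3090 ceiling at 4K Ultra native, no RT
print(min(cpu_cap, gpu_cap))  # 75 -> GPU-bound at these settings
# Lower the resolution/settings and gpu_cap rises past 110, making the CPU the wall.
```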
 
an i5-12400 bottlenecks around 110 fps in Witcher 3
Yeah, and I have a 1440p 180 Hz monitor. Besides, stronger single-core performance benefits old single-threaded games and some emulators like 86Box, so it would be a worthwhile upgrade. I could then repurpose the old platform as a server, since the other PC I could use is Haswell-based and isn't as power efficient. But again, a far-fetched idea, since the current build is good enough for me.
 
An i5-13500k runs that game at over 200 fps. So if your goal is to hit your monitor refresh rate, you can do that for $140, since 12th, 13th, and 14th gen all use the same socket:
https://www.newegg.com/intel-core-i...-1700-desktop-cpu-processor/p/N82E16819118349
13500K doesn't exist.

I see 12600KF for $100 but it's that Woot site needing Prime: https://slickdeals.net/f/18482338-i...ga-1700-processor-99-99-free-shipping-w-prime
13600KF for $150 also on Woot: https://slickdeals.net/f/18482341-i...0-desktop-processor-150-free-shipping-w-prime
 
13500K doesn't exist.
My bad, no k. Looks like the i5-12600K hits 200 fps in the Witcher 3 as well, according to YouTubers. I guess 6 cores wasn't quite enough. Do people like tables? I made a table. Processors in the same row are almost exactly the same.

P-Cores  E-Cores  12th                13th                14th
4        0        i3-12100, i3-12300  i3-13100            i3-14100
6        0        i5-12400, i5-12500  -                   -
6        4        i5-12600            i5-13400            i5-14400
6        8        -                   i5-13500, i5-13600  i5-14500, i5-14600
8        4        i7-12700            -                   -
8        8        i9-12900            i7-13700            -
8        12       -                   -                   i7-14700
8        16       -                   i9-13900            i9-14900
 
My bad, no k. Looks like the i5-12600K hits 200 fps in the Witcher 3 as well, according to YouTubers. I guess 6 cores wasn't quite enough. Do people like tables? I made a table. Processors in the same row are almost exactly the same.

P-Cores  E-Cores  12th                13th                14th
4        0        i3-12100, i3-12300  i3-13100            i3-14100
6        0        i5-12400, i5-12500  -                   -
6        4        i5-12600            i5-13400            i5-14400
6        8        -                   i5-13500, i5-13600  i5-14500, i5-14600
8        4        i7-12700            -                   -
8        8        i9-12900            i7-13700            -
8        12       -                   -                   i7-14700
8        16       -                   i9-13900            i9-14900
Oh yeah, besides AVX-512 getting cut from these (my current rig is hacked into running the instructions that are still on the silicon), everything above the 12400 has the small cores.
[Attached screenshot]
Meaning that it won't fly cuz I ain't got that fixed scheduler in my OS.
 