GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

I reckon AI will be the next big thing in gaming (PS6 and on). And they’ll probably push that to GPU as well.
There are efforts being made already to use a neural network as an entire game engine itself: https://gamengen.github.io/
I don't think this is how it will end up, but it is an interesting experiment that demonstrates what is possible.
At the very least the underlying game engine could render the graphics in Minecraft-like quality, then use a neural network (either via tensor cores, analog computing, or whatever) in order to enhance the graphics in a more efficient manner.
 
That's where a lot of the large-scale AI improvements will probably come from, but for personal computers I'm seeing stagnation, albeit one where you still need to buy new hardware because of security issues.
Yup. Moore’s law is basically over. One day, sooner rather than later, the fab game will just end. Not because it’s not possible to go smaller, but because it’s just too expensive.

Everything after that will be tiny, incremental improvements because new design rules allow transistors to be packed 1% denser or whatever.

Outside of gaming, I fear what that will mean to society as a whole.
The industry has barely touched 3D yet. Monolithic 3D chips with memory and logic mashed together could increase performance by orders of magnitude. Until the necessary technologies are ready, the industry can continue to slow walk with new, slightly better nodes. For example, TSMC's A14 is coming around 2028, and they'll probably have a few more traditional nodes through at least the early 2030s.
 
There are efforts being made already to use a neural network as an entire game engine itself: https://gamengen.github.io/
I don't think this is how it will end up, but it is an interesting experiment that demonstrates what is possible.
At the very least the underlying game engine could render the graphics in Minecraft-like quality, then use a neural network (either via tensor cores, analog computing, or whatever) in order to enhance the graphics in a more efficient manner.
That’s interesting, but I was thinking more of using AI to create levels. Or for NPCs. Adjust difficulty, etc.

Imagine GTA where everyone you see has some basic level of ChatGPT IQ and intelligent behavior.

The industry has barely touched 3D yet. Monolithic 3D chips with memory and logic mashed together could increase performance by orders of magnitude. Until the necessary technologies are ready, the industry can continue to slow walk with new, slightly better nodes. For example, TSMC's A14 is coming around 2028, and they'll probably have a few more traditional nodes through at least the early 2030s.
SRAM cell size already hit its scaling limits a while ago, AFAIK.
 
  • Like
Reactions: N Space and Vecr
SRAM cell size already hit its scaling limits a while ago, AFAIK.
SRAM has almost stopped scaling, but it's still a little smaller on N5/N3 nodes than on N7/N6. 3D V-Cache has used N7 so far.

TSMC's N3 features an SRAM bitcell size of 0.0199 µm², which is only ~5% smaller compared to N5's 0.021 µm² SRAM bitcell. It gets worse with the revamped N3E as it comes with a 0.021 µm² SRAM bitcell (which roughly translates to 31.8 Mib/mm²), which means no scaling compared to N5 at all.
[image: SRAM bitcell density chart, TSMC N3B/N3E]

But TSMC hasn't even moved to GAAFETs/forksheet transistors, backside power delivery, high-NA EUV, etc. They can probably get below 0.020 µm² on future nodes, for what it's worth.
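
As a rough sanity check on those density figures, here's a back-of-the-envelope conversion from bitcell area to usable density. The ~70% array efficiency factor (to account for sense amps, decoders, and other peripheral circuitry) is my own assumption, not part of the figures quoted above:

```python
# Back-of-the-envelope SRAM density from bitcell area.
# The ~70% array efficiency is an assumed fudge factor for peripheral
# circuitry, not a figure from the quoted numbers.

MIB = 2**20  # bits per mebibit

def sram_density_mib_per_mm2(bitcell_um2, array_efficiency=0.70):
    """Approximate usable SRAM density in Mib/mm^2 for a given bitcell area."""
    bits_per_mm2 = (1e6 / bitcell_um2) * array_efficiency  # 1 mm^2 = 1e6 um^2
    return bits_per_mm2 / MIB

for node, bitcell_um2 in [("N5", 0.021), ("N3B", 0.0199), ("N3E", 0.021)]:
    print(f"{node}: {sram_density_mib_per_mm2(bitcell_um2):.1f} Mib/mm^2")
# With the 70% assumption, N5/N3E land right around the 31.8 Mib/mm^2 quoted above.
```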

AMD's 3D V-Cache using TSMC's tech gets around SRAM scaling issues by stacking a one-layer cache chiplet (more layers are possible) on top of the underlying SRAM, connected by through-silicon vias (TSVs). The cache chiplet can use an older, cheaper node. It may not be possible to put all SRAM on the chiplet, meaning you need at least some on the bottom.

3D V-Cache is basic bitch "3D". What I'm talking about could be connected thousands of times more densely than TSVs, with gigabytes of memory, ideally all of the memory your processor needs (e.g. 128 GB+), and possibly multiple layers of cores. The biggest challenge is dealing with the heat.
 
The industry has barely touched 3D yet. Monolithic 3D chips with memory and logic mashed together could increase performance by orders of magnitude. Until the necessary technologies are ready, the industry can continue to slow walk with new, slightly better nodes. For example, TSMC's A14 is coming around 2028, and they'll probably have a few more traditional nodes through at least the early 2030s.
is it possible to stack transistors on top of each other on different layers of the same chip? How do they handle cooling then?

like say you had a four-core cpu that had each core sandwiched on top of each other, would it be impossible to cool it? or would there just be a max tdp it can operate at?
 
is it possible to stack transistors on top of each other on different layers of the same chip? How do they handle cooling then?

like say you had a four-core cpu that had each core sandwiched on top of each other, would it be impossible to cool it? or would there just be a max tdp it can operate at?
There could be microfluidic channels or something else that sounds exotic to cool it. New materials could be used. The power needed and heat generated can be brought down by backside power delivery. Getting the memory closer cuts down on a major source of power consumption, from moving data around.

There are complementary FETs on Intel's roadmap that stack two sets of nanosheets, but I don't think that qualifies as more than a single transistor:

All-in-one design integrates microfluidic cooling into electronic chips
At IEDM 2023, Intel showcases 3D stacked CMOS transistors combined with backside power and direct backside contact – first-of-a-kind advancements that will extend Moore’s Law.
Intel Shows New Stacked CFET Transistor Design At ITF World
 
  • Thunk-Provoking
Reactions: Vecr and Betonhaus
3D V-Cache is basic bitch "3D". What I'm talking about could be connected thousands of times more densely than TSVs, with gigabytes of memory, ideally all of the memory your processor needs (e.g. 128 GB+), and possibly multiple layers of cores.

HBM is already doing this, just not hundreds of GB. Each HBM stack on an H200 is I think ~24 GB. Cooling's already an issue, though, and it has a high failure rate.

is it possible to stack transistors on top of each other on different layers of the same chip? How do they handle cooling then?

like say you had a four-core cpu that had each core sandwiched on top of each other, would it be impossible to cool it? or would there just be a max tdp it can operate at?

Heat can only leave the chip through the surface, so if you sandwich cores like that, the heat from inner cores has to pass through outer cores, causing a serious cooling problem, requiring some new, exotic technology. It might be more feasible with an all-photonics chip.
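
To put some toy numbers on why that's a problem, treat each die layer as a thermal resistance in series on the way to the cooler. Every value below (per-layer power, resistances) is made up purely for illustration, not taken from any real chip:

```python
# Toy model of a vertically stacked CPU: heat from lower layers must cross
# every layer above it before reaching the cooler. All numbers are invented
# for illustration.

T_AMBIENT = 25.0        # deg C at the cooler
R_COOLER = 0.15         # K/W for cooler + TIM (assumed)
R_PER_LAYER = 0.15      # K/W per die layer the heat must cross (assumed)
POWER_PER_LAYER = 60.0  # W dissipated by each core layer (assumed)

def hottest_layer_temp(layers):
    """Rough temperature of the layer furthest from the cooler."""
    total_power = POWER_PER_LAYER * layers
    temp = T_AMBIENT + total_power * R_COOLER  # everything exits through the cooler
    for j in range(2, layers + 1):
        # The boundary above layer j carries the heat of layer j and everything below it.
        flowing = POWER_PER_LAYER * (layers - j + 1)
        temp += flowing * R_PER_LAYER
    return temp

for n in (1, 2, 4):
    print(f"{n} layer(s): hottest layer ~{hottest_layer_temp(n):.0f} C")
# 1 layer ~34 C, 2 layers ~52 C, 4 layers ~115 C -- so not strictly impossible,
# but per-layer power has to come way down, or the cooling has to get exotic.
```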

Unrelated: Since we were talking about servers, AMD just launched Turin, while Intel launched 6th Gen Xeon (Granite Rapids). It looks like Intel has finally gotten back to parity in server CPUs, as both companies' high-performance flagship CPUs have 128 cores.

They both also launched high-density solutions. Zen 5c and Intel E-Cores aren't strictly comparable, but those come in 192c and 288c versions, respectively.

Intel actually leapfrogged AMD in bandwidth this time, which is important to anybody running memory-bound workloads like engineering simulations or databases. Both Intel and AMD CPUs have 12 channels of DDR5 this go-around, but Intel supports DDR5-8000 MCR DIMMs, which multiplex two 64-bit ranks so the module can fetch 128 bits at a time.
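
Rough peak-bandwidth math, if anyone wants it: a 64-bit DDR5 channel moves 8 bytes per transfer. The DDR5-6000 figure for Turin is my assumption for the comparison; DDR5-8000 MCR is the figure above:

```python
# Peak theoretical DRAM bandwidth: channels * MT/s * 8 bytes per 64-bit transfer.
# DDR5-6000 for Turin is an assumption here; DDR5-8000 MCR is the figure above.

def peak_bw_gbs(channels, mts, bytes_per_transfer=8):
    return channels * mts * bytes_per_transfer / 1000  # GB/s, decimal

print(f"12ch DDR5-6000:     {peak_bw_gbs(12, 6000):.0f} GB/s")  # ~576 GB/s
print(f"12ch DDR5-8000 MCR: {peak_bw_gbs(12, 8000):.0f} GB/s")  # ~768 GB/s
```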

AMD has been kicking ass in server chips since Zen 2, so I'm curious to see how the market responds to Zen 5 vs GR/SF. On paper, the latter is better.
 
There could be microfluidic channels or something else that sounds exotic to cool it. New materials could be used. The power needed and heat generated can be brought down by backside power delivery. Getting the memory closer cuts down on a major source of power consumption, from moving data around.

There are complementary FETs on Intel's roadmap that stack two sets of nanosheets, but I don't think that qualifies as more than a single transistor:

All-in-one design integrates microfluidic cooling into electronic chips
At IEDM 2023, Intel showcases 3D stacked CMOS transistors combined with backside power and direct backside contact – first-of-a-kind advancements that will extend Moore’s Law.
Intel Shows New Stacked CFET Transistor Design At ITF World
so like if in between the traces of the actual circuits themselves there is an array of vertical channels, effectively making the cpu porous so that a special coolant could be pumped through it in a sort of self contained liquid cooling dealio?
 
  • Thunk-Provoking
Reactions: Vecr
so like if in between the traces of the actual circuits themselves there is an array of vertical channels, effectively making the cpu porous so that a special coolant could be pumped through it in a sort of self contained liquid cooling dealio?
That could be how it works. But there are going to be multiple competing approaches developed in labs over the next decade or two, and one of them will win out in the end.

If the solution is so exotic that it can't be easily put into consumer devices, that's ok. Big companies will make and buy things like the Wafer Scale Engine if they work.
 
  • Informative
Reactions: Gog & Magog
That could be how it works. But there are going to be multiple competing approaches developed in labs over the next decade or two, and one of them will win out in the end.

If the solution is so exotic that it can't be easily put into consumer devices, that's ok. Big companies will make and buy things like the Wafer Scale Engine if they work.
I think it's possible using existing technology.

The individual subunits of each core or memory bank would be stacked on top of each other, surrounded by channels for liquid in the most efficient shapes possible. The liquid would be in a sealed unit that has a radiator that traditional liquid cooling would hook into. The CPU liquid would be completely sealed so that nothing can plug the microscopic pores, with a contained pump to circulate the liquid through the cooler. The CPU itself would be like a square block with ports that liquid cooling pipes connect to directly.

Nothing here would require a radical approach with new technology; just figuring out how to communicate between layers, arrange the traces around coolant pores, and have a very tiny pump and radiator. Liquid cooling would be mandatory, but it may be possible to get by with a radiator that sits directly on the CPU and is basically just a big air cooler in size.

I have difficulty drawing, but I will describe it as follows:
1. The actual circuitry would be rotated 90 degrees, so that one side would be facing down and that side would contain the bus that connects to the motherboard. The CPU would be long and skinny. There would be chambers on either side of it for liquid that gets pumped through the CPU. On top of it would be the pump, which pushes the coolant through a heat-exchanging radiator, which then has a compartment where standard liquid cooling hooks into. From the outside the CPU would look like a rectangular block, like two or three cubes lined up with ports on the top of it. The bottom side would have all the pins, and there would be a bracket to secure it to the motherboard.
 
Last edited:
I have difficulty drawing, but I will describe it as follows:
1. The actual circuitry would be rotated 90 degrees, so that one side would be facing down and that side would contain the bus that connects to the motherboard. The CPU would be long and skinny. There would be chambers on either side of it for liquid that gets pumped through the CPU. On top of it would be the pump, which pushes the coolant through a heat-exchanging radiator, which then has a compartment where standard liquid cooling hooks into. From the outside the CPU would look like a rectangular block, like two or three cubes lined up with ports on the top of it. The bottom side would have all the pins, and there would be a bracket to secure it to the motherboard.
Wouldn't this make the CPU extremely fragile and severely reduce its lifespan due to thermal stress? I think microscopic heat pipes sandwiched between stacked chips, or simply going through them, are more realistic.
 
Wouldn't this make the CPU extremely fragile and severely reduce its lifespan due to thermal stress? I think microscopic heat pipes sandwiched between stacked chips, or simply going through them, are more realistic.

The short version is you need to account for that in the design. Based on Alder Lake's issues with bending, chip designers aren't accustomed to having to do mechanical engineering. But that would need to be added to the design process.
 
Graphics are GOOD ENOUGH, and the problem is that graphics getting better and better also means games getting more and more expensive. One hundred million dollars is like the low-end baseline for a AAA game these days.
There's a reason why some of the best modern games lately have been graphically simpler like FTL, KSP, Soma, Hollow Knight, Hotline Miami, Machinarium, etc.
 
  • Like
Reactions: Brain Problems
Why not have it shift from blue to yellow to red depending on the time of day?
Yellow during the day works, I guess; I generally find red lamps better even just for waking up.
God no. Remember the huge laptops we used to lug around?

I seriously think we’re reaching an inflection point though.

Graphics are GOOD ENOUGH, and the problem is that graphics getting better and better also means games getting more and more expensive. One hundred million dollars is like the low-end baseline for a AAA game these days.
Wrong. Graphics haven't advanced since Far Cry 1 and the Tenebrae mod for Quake 1.
 
Back to my use case of a virtual Windows VM with 8/16 of a 2696 v3 and 32 GB RAM: what would be the best GPU at the $200 price point and at the $400 price point for 3D CAD design? Or would there not be meaningful benefits over the K6000 with 12 GB except for power efficiency?
I have 2x K6000s pulled from a server. Tried to run PyTorch on them, but the chip is so old that you have to use some weird legacy version and it runs unbelievably slow. It's got VRAM but it has no guts at all; the CUDA cores belong in a retirement village.

Are you in Australia? I'll literally give you these cunts for free. I'm sick of looking at them.

Edit: I was going to write about Nvidia drivers refusing to work on consumer cards when PCI-E passthrough is used but apparently they ripped that out of the driver 3 years ago and I never got the memo. So your options are pretty broad and you may be able to get something better that just isn't branded Quadro and still pass it through to the guest just fine.
 
Last edited:
I have 2x K6000s pulled from a server. Tried to run PyTorch on them, but the chip is so old that you have to use some weird legacy version and it runs unbelievably slow. It's got VRAM but it has no guts at all; the CUDA cores belong in a retirement village.

Are you in Australia? I'll literally give you these cunts for free. I'm sick of looking at them.
You don't get tensor cores until V100, I think. It's pointless to try and do something cool on anything older.
 
I have 2x K6000s pulled from a server. Tried to run PyTorch on them, but the chip is so old that you have to use some weird legacy version and it runs unbelievably slow. It's got VRAM but it has no guts at all; the CUDA cores belong in a retirement village.

Are you in Australia? I'll literally give you these cunts for free. I'm sick of looking at them.

Edit: I was going to write about Nvidia drivers refusing to work on consumer cards when PCI-E passthrough is used but apparently they ripped that out of the driver 3 years ago and I never got the memo. So your options are pretty broad and you may be able to get something better that just isn't branded Quadro and still pass it through to the guest just fine.
I'm using it for 3D rendering for now; I've got a P400 I can try comparing it to, but it will likely be fine for that style of workload. The K6000 is literally only for a Windows VM. For everything that can run on Linux I've got an Intel Arc A310 I can use for AI shit, and I might pick up an A770 if it's cheap enough (it likely won't be once bidding kicks in).
 
Can anyone explain to me what GPU Voltage Offset is in simplified form? I'm trying to undervolt + overclock my 6750XT in LACT.
 
Can anyone explain to me what GPU Voltage Offset is in simplified form? I'm trying to undervolt + overclock my 6750XT in LACT.
If you were to plot a graph of frequency vs. voltage for your card in stock configuration, you would see a curve that describes the frequency (x-axis) the card will run at when supplied a given voltage (y-axis). This chart is for a CPU, but the idea is the same:
[attachment: example CPU voltage/frequency curve]

The default curve for most components is far more cautious than it needs to be, to account for variations in manufacturing. You can often lop a few hundred millivolts off the top and still hit all the frequencies in the curve while generating less heat (and generating less heat means you'll be allowed to boost to higher frequencies). In terms of the above graph, an offset shifts the whole curve down or up depending on whether it's negative or positive. For an undervolt, your offset should be negative; for an overvolt it should be positive. Because it's a flat shift, it alters behavior at all frequencies. It's worth noting that cards can often handle far more undervolting at certain frequencies than at others, so a simple voltage offset often isn't the most effective way to do this, but it is simple and doesn't involve spending hours tweaking curves.

tl;dr negative voltage offset = less heat = higher clocks possible, but also you need to test for stability after altering it. I'd start at -100 mV and go from there.
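
For the visually inclined, here's what a flat offset actually does to the voltage/frequency curve. The stock points are invented for illustration, not real 6750 XT values; as far as I understand, LACT just passes the single offset number to the driver and the card applies it to its internal curve:

```python
# A flat voltage offset shifts every point of the V/F curve by the same amount.
# The stock points below are invented for illustration, not real 6750 XT values.

stock_vf = [  # (frequency in MHz, voltage in mV)
    (1800, 950),
    (2200, 1050),
    (2600, 1150),
]

def apply_offset(curve, offset_mv):
    """Shift every voltage in the curve by offset_mv (negative = undervolt)."""
    return [(freq, volt + offset_mv) for freq, volt in curve]

for (freq, v_stock), (_, v_new) in zip(stock_vf, apply_offset(stock_vf, -100)):
    print(f"{freq} MHz: {v_stock} mV -> {v_new} mV")
```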
 
Honey, wake up, new MLID vid with some leaks about 50-series pricing
In his monthly livestream, he says there will be an 18 GB version of the 5070 at some point.

For anyone who doesn't remember, 2 GB GDDR7 modules are available now, but 3 GB GDDR7 will be available soon, so any particular card could get +50% capacity without changing the memory bus.
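
The capacity math, for anyone curious: each GDDR7 device sits on 32 bits of the bus, so module count is bus width divided by 32. The 192-bit bus for the 5070 is the commonly leaked figure, so treat it as an assumption:

```python
# VRAM capacity = (bus width / 32 bits per GDDR7 device) * capacity per device.
# The 192-bit bus for the 5070 is a leaked/assumed figure, not confirmed.

def vram_gb(bus_width_bits, gb_per_module):
    modules = bus_width_bits // 32  # one 32-bit GDDR7 device per 32 bits of bus
    return modules * gb_per_module

print(vram_gb(192, 2))  # 12 GB with 2 GB modules
print(vram_gb(192, 3))  # 18 GB with 3 GB modules -- the +50% jump
```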
 