GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

I haven't played around with LLMs yet, though I hope to. But my early investigations suggest they want a lot of RAM, i.e. 30GB+. I think only a few of them can be run on normal domestic machines. Training might be out of the question. Happy to be corrected though - I believe there are some newer ones that can be run on high-spec home machines...
32GB of RAM isn't that much anymore now that 16GB is basically the new minimum. Now if we're talking far more than that, it gets tricky, as most motherboards don't support more than 64GB.

Anyway, seems the CPU part was one particular fork of LLaMA for systems with no Nvidia GPU, like the ARM Apple SoCs, so it leverages the CPU more.
 
32GB of RAM isn't that much anymore now that 16GB is basically the new minimum. Now if we're talking far more than that, it gets tricky, as most motherboards don't support more than 64GB.
Ah, perhaps I relied too much on context and should have been clearer. I was talking VRAM. So 32GB is still a large amount for non-professional GPUs.

I believe running an LLM on CPU is possible but a great deal slower. And again, I wouldn't be surprised if actually training the models needs much, much more RAM.
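As a rough yardstick (back-of-envelope only, ignoring quantization and offloading tricks): inference memory is roughly parameter count × bytes per parameter, plus overhead for activations and the KV cache. A 7B-parameter model in fp16 is about 7B × 2 bytes ≈ 14GB, a 30B model ≈ 60GB, while 4-bit quantization cuts those to roughly 4GB and 16GB. Training is much worse because you also hold gradients and optimizer state - the usual estimate for mixed-precision Adam is ~16 bytes per parameter, so over 100GB for a 7B model before counting activations.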
 
[Attached image: GPU benchmark chart]
Surprised the 6800 is that high, not bad for a GPU that's less than $300 used.

Too bad I need something that's also good for AI, and that means dealing with nvidia...
I was talking VRAM.
Oh hell, 32GB of VRAM? That's V100 territory, $2500 used, maybe a little lower, but still way above the total cost of the average gamer PC.
 

Speculation about 7600 XT 12/10 GB variations. Could be a cut-down Navi 32 or a rebranded 6700 (XT).

Oh hell, 32GB of VRAM? That's V100 territory, $2500 used, maybe a little lower, but still way above the total cost of the average gamer PC.
I think if AMD or Nvidia don't toss a 32 GB "consumer" card out to the plebs next-gen (RDNA4 vs. Blackwell), they'll do it during the one after that. I was surprised that AMD went to 20-24 GB this gen.
 
I think if AMD or Nvidia don't toss a 32 GB "consumer" card out to the plebs next-gen (RDNA4 vs. Blackwell), they'll do it during the one after that. I was surprised that AMD went to 20-24 GB this gen.
It would be funny if the lunacy continues and the plain GeForce 6060 is 32GB while the faster 6060 Ti is 24GB. It probably wouldn't matter as much in games at that point though.
 
It would be funny if the lunacy continues and the plain GeForce 6060 is 32GB while the faster 6060 Ti is 24GB. It probably wouldn't matter as much in games at that point though.
In terms of market segmentation that's not unheard of. The A4000 performs slightly worse than the 3080 Ti in most applications, but it has significantly more memory, allowing it to run bigger professional workloads, if often a bit slower than the gaming card. It would be unusual to market them both as gaming cards though; your 6060 would probably be sold as the Quadro RTX A6000 or something (and at twice the price).
 
On the opposite end of what GPUs can do, I've started learning to code in SYCL. Intel's got a very slick tutorial to get set up using VS Code as your editor, WSL as the backend, and compiling to x86 + Intel GPU as your target, or x86 + NVIDIA. I was quite surprised to learn that these ho-hum little iGPUs Intel puts on seemingly every desktop CPU now are capable of general computing. It'll do linear algebra faster than the equivalent multithreaded code running on all 8+8 cores.

If any of you are C++ programmers, I strongly recommend at least working through a couple tutorials and familiarizing yourself with the device/host programming paradigm. The concepts in SYCL map onto CUDA, HIP, and Kokkos/RAJA, but it's easier to get started.
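For anyone wondering what "device/host" looks like in practice, here's a rough vector-add sketch in the standard SYCL 2020 buffer/accessor style (my own toy example, not from Intel's tutorial; any SYCL 2020 compiler, e.g. oneAPI's icpx with -fsycl, should take it):

```cpp
// Minimal SYCL 2020 vector add, as a sketch of the host/device split.
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
    constexpr size_t N = 1 << 20;
    std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

    // Let the runtime pick a device (iGPU, dGPU, or CPU fallback).
    sycl::queue q{sycl::default_selector_v};
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    {
        // Buffers handle host<->device transfers implicitly.
        sycl::buffer<float> A(a.data(), sycl::range<1>(N));
        sycl::buffer<float> B(b.data(), sycl::range<1>(N));
        sycl::buffer<float> C(c.data(), sycl::range<1>(N));

        q.submit([&](sycl::handler& h) {
            sycl::accessor accA(A, h, sycl::read_only);
            sycl::accessor accB(B, h, sycl::read_only);
            sycl::accessor accC(C, h, sycl::write_only, sycl::no_init);
            // The lambda below is the device kernel: one work-item per element.
            h.parallel_for(sycl::range<1>(N), [=](sycl::id<1> i) {
                accC[i] = accA[i] + accB[i];
            });
        });
    } // buffers go out of scope here, which syncs results back to the host

    std::cout << "c[0] = " << c[0] << " (expected 3)\n";
}
```

The nice part is the default selector: the same binary generally runs the kernel on the iGPU if the runtime finds one, and falls back to a CPU device otherwise.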

Too bad I need something that's also good for AI, and that means dealing with nvidia...

Intel's software stack is rapidly catching up, but they don't sell workstation GPUs, and their datacenter GPU is very, very scarce.
 
I was quite surprised to learn that these ho-hum little iGPUs Intel puts on seemingly every desktop CPU now are capable of general computing. It'll do linear algebra faster than the equivalent multithreaded code running on all 8+8 cores.
Intel's iGPUs surprise in rather interesting ways. When it comes to en/de/transcoding and QSV, they must have some black magic at work.
 
Intel's iGPUs surprise in rather interesting ways. When it comes to en/de/transcoding and QSV, they must have some black magic at work.

Really, the only thing special about them is they're built on Xe architecture, so it's as trivial to run generic code on them as it is on NVIDIA GPUs. More trivial, even, because SYCL is just plain an easier API to learn than CUDA. You can't do the same with AMD iGPUs because they don't support HIP.
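If you want to sanity-check what's visible before writing any kernels, a plain SYCL enumeration loop (nothing Xe- or Intel-specific in it) will show the iGPU sitting right next to the CPU device:

```cpp
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    // Print every platform/device the SYCL runtime can see.
    for (const auto& plat : sycl::platform::get_platforms()) {
        std::cout << plat.get_info<sycl::info::platform::name>() << "\n";
        for (const auto& dev : plat.get_devices()) {
            std::cout << "  - " << dev.get_info<sycl::info::device::name>()
                      << (dev.is_gpu() ? " [GPU]" : dev.is_cpu() ? " [CPU]" : "")
                      << "\n";
        }
    }
}
```

On a typical Intel desktop you'd expect to see the Xe iGPU (usually via the Level Zero backend) plus an OpenCL CPU device, assuming the runtime drivers are installed.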
 
You can't do the same with AMD iGPUs because they don't support HIP.
Did anything change with the new RDNA based AMD iGPUs? I remember they used Vega for the longest time on their APUs and AMD has the worst consistency when it comes to what works on which arch.
 
Did anything change with the new RDNA based AMD iGPUs?

No. GPU world at AMD is, speaking strictly as an observer/user/etc, an absolute fuckin trash fire of inconsistent architecture and horrendously bad software support. Last time I looked at the ROCm documentation online, it was full of 404s and doxygen errors. Even in gaming GPUs, frankly, FSR just kind of sucks. Intel and NVIDIA have both gone with ML-tuned algorithms, and the difference is pretty stunning (note that XeSS only does the ML stuff on Arc GPUs; it basically downgrades to something like FSR on non-Intel architectures). Let me see if I can throw something together.
 
Intel's software stack is rapidly catching up, but they don't sell workstation GPUs, and their datacenter GPU is very, very scarce.
The problem with Intel GPUs is that they seem to lag a long way behind in performance for their price point, or did I miss something about a new Arc card that just came out? Because even the top A770 scores like half the power of a 3060 for a card that costs about 50% more, and that's without considering all the used 3060s out there.
 
The problem with Intel GPUs is that they seem to lag a long way behind in performance for their price point, or did I miss something about a new Arc card that just came out? Because even the top A770 scores like half the power of a 3060 for a card that costs about 50% more, and that's without considering all the used 3060s out there.

A770 has 16 GB. 3060 Ti only has 8. For any kind of compute workload, I'll take 5% slower compute + 2x more RAM any day.
 
For those looking for some CPU oomph to take a break from GPUs for a moment, it looks like Threadripper is returning. AMD is expected to announce a Zen 4-based processor in the next few weeks.


All the way up to 96 cores / 192 threads, which will no doubt carry a hefty price, but down to 16 cores / 32 threads, which is probably affordable for those of us that actually have a need for this sort of power. That may seem a low core count, but the Threadripper platform will get you eight DDR5 slots, so 256GB of RAM if you want it. And 128 PCIe 5.0 lanes. So load it up with GPUs running at full throttle, as much insanely fast SSD as you want...

It's going to be a small proportion of us that actually could justify a build like this but for the home user who wants to, it looks pretty neat. Makes me want to get into LLMs just so I can justify it!
 
For those looking for some CPU oomph to take a break from GPUs for a moment, it looks like Threadripper is returning. AMD is expected to announce a Zen 4-based processor in the next few weeks.


All the way up to 96 cores / 192 threads, which will no doubt carry a hefty price, but down to 16 cores / 32 threads, which is probably affordable for those of us that actually have a need for this sort of power. That may seem a low core count, but the Threadripper platform will get you eight DDR5 slots, so 256GB of RAM if you want it. And 128 PCIe 5.0 lanes. So load it up with GPUs running at full throttle, as much insanely fast SSD as you want...

There's not a lot of reason to buy a 16-core Threadripper over a Ryzen unless you need gobs and gobs of RAM. With DDR5-4800, even highly bandwidth-bound workloads should be able to make use of all 16 cores of a Ryzen. However, I had some 64-core Zen 3 Threadripper PROs on site, and they paid for themselves many times over in engineer-hours saved. The Zen 2 Threadripper with only 4 memory channels sucked, but at the time, I guess it was good for 16 cores, and Ryzen/Core didn't go that high. We just needed more than that.

Edit:
You can get a 4-channel 16c Zen 2 Threadripper for $450. It's got 22% more bandwidth than a current-gen Ryzen (only DDR4-2933, but four channels of it), so for certain CAE workloads, it will probably outperform it a bit. Plus you can put a lot more RAM on the motherboard.
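That 22% is just channel math: 4 channels of DDR4-2933 is 4 × 2933 MT/s × 8 bytes ≈ 94 GB/s of theoretical bandwidth, versus 2 × 4800 × 8 ≈ 77 GB/s for dual-channel DDR5-4800, and 94/77 ≈ 1.22 (peak numbers, sustained bandwidth will be lower on both).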
 
Eh, I've seen a decent number of people who really want the extra lanes of the Threadripper platform. There'll be people excited about a new one.

Edit - think PC power users.
 