GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

I haven't played around with LLMs yet, though I hope to. But my early investigations suggest they want a lot of RAM, i.e. 30GB+. I think only a few of them can be run on normal domestic machines. Training might be out of the question. Happy to be corrected though - I believe there are some newer ones that can be run on high-spec home machines...
32GB of RAM isn't that much anymore now that 16GB is basically the new minimum. Now if we're talking far more than that, it gets tricky, as most motherboards don't support more than 64GB.

Anyway, seems the CPU part was one particular fork of LLaMA for systems with no Nvidia GPU, like the ARM Apple SoCs, so it leverages the CPU more.
 
32GB of RAM isn't that much anymore now that 16GB is basically the new minimum. Now if we're talking far more than that, it gets tricky, as most motherboards don't support more than 64GB.
Ah, perhaps I relied too much on context and should have been clearer. I was talking VRAM. So 32GB is still a large amount for non-professional GPUs.

I believe running an LLM on CPU is possible but a great deal slower. And again, I wouldn't be surprised if actually training the models needs much, much more RAM.
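As a rough yardstick (back-of-envelope only, ignoring quantization and offloading tricks): inference memory is roughly parameter count × bytes per parameter, plus overhead for activations and the KV cache. A 7B-parameter model in fp16 is about 7B × 2 bytes ≈ 14GB, a 30B model ≈ 60GB, while 4-bit quantization cuts those to roughly 4GB and 16GB. Training is much worse because you also hold gradients and optimizer state - the usual estimate for mixed-precision Adam is ~16 bytes per parameter, so over 100GB for a 7B model before counting activations.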
 
[Attached image: GPU benchmark chart]
Surprised the 6800 is that high, not bad for a GPU that's less than $300 used.

Too bad I need something that's also good for AI, and that means dealing with nvidia...
I was talking VRAM.
Oh hell, 32GB of VRAM? That's V100 territory, $2500 used, maybe a little lower, but still way above the total cost of the average gamer PC.
 

Speculation about 7600 XT 12/10 GB variations. Could be a cut-down Navi 32 or a rebranded 6700 (XT).

Oh hell, 32GB of VRAM? That's V100 territory, $2500 used, maybe a little lower, but still way above the total cost of the average gamer PC.
I think if AMD or Nvidia don't toss a 32 GB "consumer" card out to the plebs next-gen (RDNA4 vs. Blackwell), they'll do it during the one after that. I was surprised that AMD went to 20-24 GB this gen.
 
I think if AMD or Nvidia don't toss a 32 GB "consumer" card out to the plebs next-gen (RDNA4 vs. Blackwell), they'll do it during the one after that. I was surprised that AMD went to 20-24 GB this gen.
It would be funny if the lunacy continues and the plain GeForce 6060 is 32GB while the faster 6060 Ti is 24GB. It probably wouldn't matter as much in games at that point though.
 
It would be funny if the lunacy continues and the plain GeForce 6060 is 32GB while the faster 6060 Ti is 24GB. It probably wouldn't matter as much in games at that point though.
In terms of market segmentation that's not unheard of. The A4000 performs slightly worse than the 3080 Ti in most applications, but it has significantly more memory, allowing it to run bigger professional workloads, if often a bit slower than the gaming card. It would be unusual to market them both as gaming cards though; your 6060 would probably be sold as the Quadro RTX A6000 or something (and at twice the price).
 
On the opposite end of what GPUs can do, I've started learning to code in SYCL. Intel's got a very slick tutorial to get set up using VS Code as your editor, WSL as the backend, and compiling to x86 + Intel GPU as your target, or x86 + NVIDIA. I was quite surprised to learn that these ho-hum little iGPUs Intel puts on seemingly every desktop CPU now are capable of general computing. It'll do linear algebra faster than the equivalent multithreaded code running on all 8+8 cores.

If any of you are C++ programmers, I strongly recommend at least working through a couple tutorials and familiarizing yourself with the device/host programming paradigm. The concepts in SYCL map onto CUDA, HIP, and Kokkos/RAJA, but it's easier to get started.
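For anyone wondering what "device/host" looks like in practice, here's a rough vector-add sketch in the standard SYCL 2020 buffer/accessor style (my own toy example, not from Intel's tutorial; any SYCL 2020 compiler, e.g. oneAPI's icpx with -fsycl, should take it):

```cpp
// Minimal SYCL 2020 vector add, as a sketch of the host/device split.
#include <sycl/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
    constexpr size_t N = 1 << 20;
    std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

    // Let the runtime pick a device (iGPU, dGPU, or CPU fallback).
    sycl::queue q{sycl::default_selector_v};
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    {
        // Buffers handle host<->device transfers implicitly.
        sycl::buffer<float> A(a.data(), sycl::range<1>(N));
        sycl::buffer<float> B(b.data(), sycl::range<1>(N));
        sycl::buffer<float> C(c.data(), sycl::range<1>(N));

        q.submit([&](sycl::handler& h) {
            sycl::accessor accA(A, h, sycl::read_only);
            sycl::accessor accB(B, h, sycl::read_only);
            sycl::accessor accC(C, h, sycl::write_only, sycl::no_init);
            // The lambda below is the device kernel: one work-item per element.
            h.parallel_for(sycl::range<1>(N), [=](sycl::id<1> i) {
                accC[i] = accA[i] + accB[i];
            });
        });
    } // buffers go out of scope here, which syncs results back to the host

    std::cout << "c[0] = " << c[0] << " (expected 3)\n";
}
```

The nice part is the default selector: the same binary generally runs the kernel on the iGPU if the runtime finds one, and falls back to a CPU device otherwise.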

Too bad I need something that's also good for AI, and that means dealing with nvidia...

Intel's software stack is rapidly catching up, but they don't sell workstation GPUs, and their datacenter GPU is very, very scarce.
 
I was quite surprised to learn that these ho-hum little iGPUs Intel puts on seemingly every desktop CPU now are capable of general computing. It'll do linear algebra faster than the equivalent multithreaded code running on all 8+8 cores.
Intel's iGPUs surprise in rather interesting ways. When it comes to en/de/transcoding and QSV, they must have some black magic at work.
 
Intel's iGPUs surprise in rather interesting ways. When it comes to en/de/transcoding and QSV, they must have some black magic at work.

Really, the only thing special about them is they're built on Xe architecture, so it's as trivial to run generic code on them as it is on NVIDIA GPUs. More trivial, even, because SYCL is just plain an easier API to learn than CUDA. You can't do the same with AMD iGPUs because they don't support HIP.
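If you want to sanity-check what's visible before writing any kernels, a plain SYCL enumeration loop (nothing Xe- or Intel-specific in it) will show the iGPU sitting right next to the CPU device:

```cpp
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    // Print every platform/device the SYCL runtime can see.
    for (const auto& plat : sycl::platform::get_platforms()) {
        std::cout << plat.get_info<sycl::info::platform::name>() << "\n";
        for (const auto& dev : plat.get_devices()) {
            std::cout << "  - " << dev.get_info<sycl::info::device::name>()
                      << (dev.is_gpu() ? " [GPU]" : dev.is_cpu() ? " [CPU]" : "")
                      << "\n";
        }
    }
}
```

On a typical Intel desktop you'd expect to see the Xe iGPU (usually via the Level Zero backend) plus an OpenCL CPU device, assuming the runtime drivers are installed.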
 
You can't do the same with AMD iGPUs because they don't support HIP.
Did anything change with the new RDNA based AMD iGPUs? I remember they used Vega for the longest time on their APUs and AMD has the worst consistency when it comes to what works on which arch.
 
Did anything change with the new RDNA based AMD iGPUs?

No. GPU world at AMD is, speaking strictly as an observer/user/etc, an absolute fuckin trash fire of inconsistent architecture and horrendously bad software support. Last time I looked at the ROCm documentation online, it was full of 404s and doxygen errors. Even in gaming GPUs, frankly, FSR just kind of sucks. Intel and NVIDIA have both gone with ML-tuned algorithms, and the difference is pretty stunning (note that XeSS only does the ML stuff on Arc GPUs; it basically downgrades to something like FSR on non-Intel architectures). Let me see if I can throw something together.
 
Intel's software stack is rapidly catching up, but they don't sell workstation GPUs, and their datacenter GPU is very, very scarce.
The problem with Intel GPUs is that they seem to lag a long way behind in performance for their price point, or did I miss something about a new Arc card that just came out? Because even the top A770 scores like half the power of a 3060 for a card that costs about 50% more, and that's without considering all the used 3060s out there.
 
The problem with Intel GPUs is that they seem to lag a long way behind in performance for their price point, or did I miss something about a new Arc card that just came out? Because even the top A770 scores like half the power of a 3060 for a card that costs about 50% more, and that's without considering all the used 3060s out there.

A770 has 16 GB. 3060 Ti only has 8. For any kind of compute workload, I'll take 5% slower compute + 2x more RAM any day.
 
For those looking for some CPU oomph to take a break from GPUs for a moment, it looks like Threadripper is returning. AMD is expected to announce a Zen 4-based processor in the next few weeks.


All the way up to 96 cores / 192 threads, which will no doubt carry a hefty price, but down to 16 cores / 32 threads, which is probably affordable for those of us that actually have a need for this sort of power. That may seem a low core count, but the Threadripper platform will get you eight DDR5 slots, so 256GB of RAM if you want it. And 128 PCIe 5.0 lanes. So load it up with GPUs running at full throttle, as much insanely fast SSD as you want...

It's going to be a small proportion of us that actually could justify a build like this but for the home user who wants to, it looks pretty neat. Makes me want to get into LLMs just so I can justify it!
 
For those looking for some CPU oomph to take a break from GPUs for a moment, it looks like Threadripper is returning. AMD is expected to announce a Zen 4-based processor in the next few weeks.


All the way up to 96 cores / 192 threads, which will no doubt carry a hefty price, but down to 16 cores / 32 threads, which is probably affordable for those of us that actually have a need for this sort of power. That may seem a low core count, but the Threadripper platform will get you eight DDR5 slots, so 256GB of RAM if you want it. And 128 PCIe 5.0 lanes. So load it up with GPUs running at full throttle, as much insanely fast SSD as you want...

There's not a lot of reason to buy a 16-core Threadripper over a Ryzen unless you need gobs and gobs of RAM. With DDR5-4800, even highly bandwidth-bound workloads should be able to make use of all 16 cores of a Ryzen. However, I had some 64-core Zen 3 Threadripper PROs on site, and they paid for themselves many times over in engineer-hours saved. The Zen 2 Threadripper with only 4 memory channels sucked, but at the time, I guess it was good for 16 cores, and Ryzen/Core didn't go that high. We just needed more than that.

Edit:
You can get a 4-channel 16c Zen 2 Threadripper for $450. It's got 22% more bandwidth than a current-gen Ryzen (only DDR4-2933, but four channels of it), so for certain CAE workloads, it will probably outperform it a bit. Plus you can put a lot more RAM on the motherboard.
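That 22% is just channel math: 4 channels of DDR4-2933 is 4 × 2933 MT/s × 8 bytes ≈ 94 GB/s of theoretical bandwidth, versus 2 × 4800 × 8 ≈ 77 GB/s for dual-channel DDR5-4800, and 94/77 ≈ 1.22 (peak numbers, sustained bandwidth will be lower on both).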
 
Eh, I've seen a decent number of people who really want the extra lanes of the Threadripper platform. There'll be people excited about a new one.

Edit - think PC power users.
 