GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separe but Equal. Intel rides in the back of the bus.

The Ugly One · Sep 28, 2023

The Mass Shooter Ron Soye said:
Intel's heterogeneous approach is interesting, but overrated in relation to AMD. AMD is expected to follow suit with 4+8 Strix Point mobile (12 cores, 24 threads). If you need more multi-threaded performance than that in a laptop, they will have 16-core Dragon/Fire Range and maybe Strix Halo.

It's not just performance. Laptop CPU sales are 100% to OEMs. The fact that Intel managed to not fall behind (as opposed to servers & workstations) means that when they reach process parity, AMD's inability to build a significant laptop CPU business means they won't have an opening. 2, 3 years ago, AMD could have developed a laptop chip that just killed Intel on all fronts and grabbed significant share of that OEM pie. Now? IMO it's too late already. They might be able to develop some marginal advantage, but a marginal advantage doesn't get anyone at Dell to change a supply chain.

The Mass Shooter Ron Soye said:
AMD got its XDNA AI chip in mobile first. Maybe Intel will beat them in software support, but it's kind of important for the future of AMD and they bought more expertise with the Xilinx acquisiton. I'm looking forward to seeing if AMD pushes XDNA hard in desktop, since they hinted it won't be in Granite Ridge (Zen 5).

Not in a meaningful way. A few months really means nothing for capturing OEM market share. The reason is changing over business relationships is an enormously slow, expensive process. To beat an incumbent, you have to deliver in a big way over a sustained period. AMD was first with 32-core server CPUs with Zen 1, but it didn't really start grabbing market share from Intel in a major way until Zen 2, when it became pretty clear to the people who sell server nodes (Dell, HPE, Supermicro, etc) that Intel had fallen behind and was not going to catch up soon.

By contrast, over 70% of the laptop CPU market share is Intel's, and AMD hasn't been growing in that space meaningfully. The fact that they shoved 7040U out the door a few months before the Meteor Lake launch isn't changing any plans at Dell, HP, or Lenovo, nor is it changing any development plans anywhere. As Intel closes their process node deficit, AMD's window of opportunity is closing as well. Sure, E-cores are coming to AMD, but it's too late. They should have done that in 2020, when Intel was struggling along at 10 nm and failing to meaningfully advance. They'd probably have a majority of the laptop market if they had the foresight to do that, since they'd be crushing Intel on battery life.

Furthermore, software isn't hypothetical. The oneAPI stack beats the shit out of ROCm, because it works. I have it, and I've been using it (and, ironically, using it on a platform with a Ryzen CPU to develop on an NVIDIA GPU). AMD's shit is half-broken all the time. So in consumer-facing AI, right now, I'm looking at a landscape where

I expect Intel to continue have a 70% market share of consumer hardware for the foreseeable future
Intel's developer tools work
Intel's tools support more platforms than AMD's do

So if I'm developing ML-powered software, will I support AMD? Sure. They're 25%-30% of the market, not 0%. But Intel's going to be my lead platform. My AMD support will be to the extent I can easily port my Intel code.

The Mass Shooter Ron Soye said:
Intel is likely to copy AMD and pursue big 3D caches. Probably the Adamantine L4 cache, but not in Meteor Lake. What AMD hasn't done is stick any big caches in mobile other than Dragon Range X3D, or allow it to be accessed by integrated graphics. If they do both, it can have a much bigger impact than squeezing a few more frames out of a 4090.

Big L3 caches are essential in the server space, where engineering workloads (think FEA, CFD, EDA, etc) drive a lot of purchasing, and the price of the extra cache is worth it. Xeon is getting hammered by EPYC X-series there, so I fully expect Intel to follow AMD's lead there soon (Xeon Max's HBM is a stopgap measure). For integrated graphics on mobile, integrated LPDDR5 is going to be a much more effective route than increasing L3 cache, and I expect everyone to follow Apple's lead there.

Post Reply · Sep 28, 2023

The Ugly One said:
but their strategy of using the same architecture on both gaming GPUs and AI/ML accelerators may not be sustainable

I think this will almost certainly prove true, just as it eventually did for crypto mining. I also suspect we may see some big compute names partner with AMD and try to whip ROCm into shape, CUDA simply gives NVidia too much power over their customers in the near term.

The Ugly One · Sep 29, 2023

Post Reply said:
I think this will almost certainly prove true, just as it eventually did for crypto mining. I also suspect we may see some big compute names partner with AMD and try to whip ROCm into shape, CUDA simply gives NVidia too much power over their customers in the near term.

Vendor-specific languages are rapidly becoming about as relevant to day-to-day GPU programming as AVX intrinsics are to CPU programming. You have a few major options to not have to ever touch CUDA or HIP directly, namely, Kokkos, RAJA, and SYCL. There's also a new language out called Mojo, which is very python-like. So I would expect within a year or two, pretty much all GPU applications will be coded in one of these higher-level APIs.

Freedom Fighter · Oct 6, 2023

Gentlemen, I have upgraded from my lowly GTX 970.

...to a GTX 980ti. I love having friends that just give you their old shit.

The Ugly One · Oct 7, 2023

Read this thread for an example of why I say AMD sucks at software:

https://github.com/RadeonOpenCompute/ROCm/issues/1714

The Ugly One · Oct 11, 2023

Intel just launched the Arc A580, which appears to be competitive with the low-end 3000 series GeForces (8 GB, advertised for 1080p). It's not going to win any awards, but on the other hand, it's also only $179.

https://www.newegg.com/sparkle-arc-a580-sa580c-8goc/p/N82E16814993006

JoseRaulChupacabra · Oct 11, 2023

This software talk reminds me of the GPU rendering market. It seems every developer in that sector is interested in CUDA.

It's been over 10 years, and AFAIK, OpenCL based renderers still haven't caught up to CUDA based renderers both in terms of features and market share.

I could be wrong as its been years since I turned that rock.

Premium Dog Food · Oct 11, 2023

JoseRaulChupacabra said:
This software talk reminds me of the GPU rendering market. It seems every developer in that sector is interested in CUDA.

It's been over 10 years, and AFAIK, OpenCL based renderers still haven't caught up to CUDA based renderers both in terms of features and market share.

I could be wrong as its been years since I turned that rock.

Blender dropped OpenCL (for Cycles) in favor of HIP on version 3.0, and it's the only one of the renderers I know that supported OpenCL.
If I recall correctly, AMD just stopped caring about the OpenCL implementation back then to focus on HIP, which has yet to gain any decent traction.
CUDA has become the norm for pretty much all render engines and anything compute wise. Some renderers also support Metal for MacOS machines, but I have yet to see one that supports HIP (besides Cycles)

The Ugly One · Oct 12, 2023

JoseRaulChupacabra said:
This software talk reminds me of the GPU rendering market. It seems every developer in that sector is interested in CUDA.

It's been over 10 years, and AFAIK, OpenCL based renderers still haven't caught up to CUDA based renderers both in terms of features and market share.

I could be wrong as its been years since I turned that rock.

OpenCL is one of the worst APIs ever devised by mankind. Since it's a C API, there are no lambdas, no algorithms, no function objects, etc, meaning there aren't a lot of ways to express the concept of a kernel. So what they did is use strings. Check out this example:

https://github.com/rsnemmen/OpenCL-examples/tree/master/add_numbers

What it does is read in a program from a FILE pointer, create a char ** pointer from it, and pass it to OpenCL. The GPU-accelerated kernel is in add_numbers.cl. That's the file that is loaded, parsed, and and compiled by the OpenCL framework at run time. Imagine trying to work on an application at any scale where you can't compile the code that does the bulk of the work until you run it. Where if you want to pass a small piece of code to the GPU, you write it as a fucking string.

AMD shot itself in the dick by going with OpenCL the first time around. CUDA had the disadvantage of being proprietary, but it had the advantage of not being toxic nightmare fuel, and that's why NVIDIA won. AMD has tried to fix things with HIP, but the rollout has been disastrously bad. They are trying to fix it, but in the end, I think Intel's going to win the API war, because you can write in SYCL and run on all three GPUs, Intel FPGAs, and Gaudi.

JoseRaulChupacabra · Oct 12, 2023

The Ugly One said:
OpenCL is one of the worst APIs ever devised by mankind. Since it's a C API, there are no lambdas, no algorithms, no function objects, etc, meaning there aren't a lot of ways to express the concept of a kernel. So what they did is use strings. Check out this example:

https://github.com/rsnemmen/OpenCL-examples/tree/master/add_numbers

What it does is read in a program from a FILE pointer, create a char ** pointer from it, and pass it to OpenCL. The GPU-accelerated kernel is in add_numbers.cl. That's the file that is loaded, parsed, and and compiled by the OpenCL framework at run time. Imagine trying to work on an application at any scale where you can't compile the code that does the bulk of the work until you run it. Where if you want to pass a small piece of code to the GPU, you write it as a fucking string.

I'm confined to the plug textures, set sampler settings, color mapping, etc. part of the process and know almost nothing about software. So I might as well be reading Chinese with that.

It does sound like a nightmare though.

The Ugly One said:
AMD shot itself in the dick by going with OpenCL the first time around. CUDA had the disadvantage of being proprietary, but it had the advantage of not being toxic nightmare fuel, and that's why NVIDIA won. AMD has tried to fix things with HIP, but the rollout has been disastrously bad. They are trying to fix it, but in the end, I think Intel's going to win the API war, because you can write in SYCL and run on all three GPUs, Intel FPGAs, and Gaudi.

When I was in college and got a gaming/workstation, my friend chewed me out for going for green team and told me that OpenCL was the future. Obviously, that didn't pan out, but even if it did, the existing rendering engines of the time were either CPU based or CUDA based. You can't learn from non-existent engines.

While I would love OpenStandards™ to be the mainstream, the reality is, work needs to get done. If proprietary solutions are what gets things done, then proprietary it is. Any costs that might get incurred by usually more expensive green stuff is just going to get passed to the client.

Anyway, it's been years, but last time I checked, CPU rendering is still the go to for any large work. It's what everyone knows, and its what every computer has, and works the same team red or team blue. So no surprise requirements. Just moar cores and moar RAM. It's also the most feature complete.

The Ugly One · Oct 12, 2023

JoseRaulChupacabra said:
I'm confined to the plug textures, set sampler settings, color mapping, etc. part of the process and know almost nothing about software. So I might as well be reading Chinese with that.

OK, let me break it down into lay programmer's terms (I'm assuming you write some code).

C code:

Code:

for (int i = 0; i < N; ++i)
  a[i] += b[i];

I compile it with a C compiler, and if I fucked up the syntax, the compiler calls me a retard and tells me to fix my shit.

OpenCL code (egregiously simplified to be readable, don't @ me):

Code:

char *myCode =  "for (int i = 0; i < N; ++i)   a[i] += b[i];"

clDoStuff(myCode);

The code that does work is a string. It doesn't even get parsed until I run it, so I don't know if I wrote it correctly until my code is running, making debugging it an absolute nightmare. 100% of OpenCL defenders defend it "because it's open," not because it's good. Nobody thinks it's good. The fact that Apple, who invented it, has abandoned it in favor Metal tells you all you need to know.

JoseRaulChupacabra said:
While I would love OpenStandards™ to be the mainstream, the reality is, work needs to get done. If proprietary solutions are what gets things done, then proprietary it is.

Everyone had a massive love-in for open standards in the late 90s and 00s, and unfortunately, some of those people are in positions of power at AMD, pursuing crappy FOSS solutions because they're "open" and for no other good reason. NVIDIA sees the writing on the wall, though, and they've got a team working on being able to just write standard C++ code, compile it with their compiler, and run it on a GPU. It turns out that openness on its own is not intrinsically a good thing. If something's a giant pile of shit and open, nobody cares. As you said, I gotta get work done, and if the only thing that isn't a turd is proprietary, well, guess I'll be paying the license fee to use it. For this reason, every Linux vs Windows thread devolves into Team Open vs Team Get Work Done, every time. IME the virtue of open source is that it provides a place for giant corporations to lay down their arms, agree to collaborate on standards, and get something done for mutual benefit. When they actually do that, like with LLVM, Open MPI, Linux, etc, you get some really good stuff out of it. Right now, frankly, nobody's cooperating. Apple has Metal, Intel has oneAPI, NVIDIA has CUDA, and AMD has HIP.

I do not want to write my code four fucking times, you goddamned assholes. Somebody has to win, and I'm hoping it's Intel, because it's the least bad of the four. (I actually hope datacenter GPUs die, because SIMT is a horrible programming paradigm.)

JoseRaulChupacabra said:
Anyway, it's been years, but last time I checked, CPU rendering is still the go to for any large work. It's what everyone knows, and its what every computer has, and works the same team red or team blue. So no surprise requirements. Just moar cores and moar RAM. It's also the most feature complete.

It's because large-scale rendering was a niche application forever. GPUs communicated with CPUs over the PCIe bus, which was just way too slow, typically around half the speed of the IB network or less, making GPU -> Main memory -> Network too slow to be useful. With AI/ML exploding and turning into an infinity gazillion dollar industry, everyone is now taking large-scale batch processing of GPU code on low-latency networks seriously. So now you have NVLink going straight to the network, CUDA-aware MPI, and other technologies to enable HPC-scale GPU computations. You'll see more and more HPC renderers support GPUs as accelerated clusters become more common.

Betonhaus · Oct 13, 2023

I was given a MSI Trident 3 (10th gen, GTX 1660 SUPER, i5-10400F) and i was wondering if i should try to sell it, or use it to replace my current system (Ryzen 3 3600, Dell OEM RX 5700) and sell that. I'm not gaming as much so i don't really need the added power, and the smaller system would fit on my desk better (tho make drilling vent holes in my desk moot now)

TheRedChair · Oct 13, 2023

Betonhaus said:
I was given a MSI Trident 3 (10th gen, GTX 1660 SUPER, i5-10400F) and i was wondering if i should try to sell it, or use it to replace my current system (Ryzen 3 3600, Dell OEM RX 5700) and sell that. I'm not gaming as much so i don't really need the added power, and the smaller system would fit on my desk better (tho make drilling vent holes in my desk moot now)

The Dell RX 5700 is probably a VisionTek card and IMHO they run well and is in my current setup being undervolted.

I also have the Ryzen 3600 in my back up computer with a 1070 in it. .

If I was given a computer like what you got I would keep it as a back up as I have done.
I've really have not sold much of my equipment as ironically some of it I still use.

Betonhaus · Oct 13, 2023

TheRedChair said:
The Dell RX 5700 is probably a VisionTek card and IMHO they run well and is in my current setup being undervolted.

I also have the Ryzen 3600 in my back up computer with a 1070 in it. .

If I was given a computer like what you got I would keep it as a back up as I have done.
I've really have not sold much of my equipment as ironically some of it I still use.

yeah the more i think about it the more that makes sense. I'll probably use it to replace the busted laptop I'm using as a HTPC. Maybe throw some emulators onto it.

The Ugly One · Oct 14, 2023

Betonhaus said:
I was given a MSI Trident 3 (10th gen, GTX 1660 SUPER, i5-10400F) and i was wondering if i should try to sell it, or use it to replace my current system (Ryzen 3 3600, Dell OEM RX 5700) and sell that. I'm not gaming as much so i don't really need the added power, and the smaller system would fit on my desk better (tho make drilling vent holes in my desk moot now)

The performance difference between those two systems should be miniscule, so if you prefer the smaller form factor, you should get rid of the Ryzen-based system.

WelperHelper99 · Oct 15, 2023

chrome_screenshot_Oct 11, 2023 7_33_06 AM MDT.png

So over prime days, I bought the Ryzen 9 5900X. 12 cores, 3.7 ghz base, 4.8 ghz boost speed. This is the fastest core I've ever held in my own two hands, yet it's more in the middle of Ryzen's line up. What can it do? Can it play modern games at 1080p at a speedy framerate with little to no lag, even with a moderate sized card and 32 gigs of ram?

Susanna · Oct 15, 2023

WelperHelper99 said:
Can it play modern games at 1080p at a speedy framerate with little to no lag, even with a moderate sized card and 32 gigs of ram?

That will depend on the graphics card, but yes.

WelperHelper99 · Oct 15, 2023

snov said:
That will depend on the graphics card, but yes.

At minimum a 3060. I'm looking at others, but with my budget, a 3060 is about as much as I want to spend asides from the power unit, which in going all out on. There is the 4060 in a similar price bracket, but it's VRAM is 8 gigs compared to the 3060's 12 gigs.

Susanna · Oct 15, 2023

WelperHelper99 said:
At minimum a 3060. I'm looking at others, but with my budget, a 3060 is about as much as I want to spend asides from the power unit, which in going all out on. There is the 4060 in a similar price bracket, but it's VRAM is 8 gigs compared to the 3060's 12 gigs.

The 3060 is essentially just a 1080Ti with (very limited) raytracing (in many benchmarks it actually performs worse), which in 1080p gaming you’re unlikely to want anyway (turning it on will drop you from 100fps to 10). The only benefit it really has is DLSS support. It’s honestly not great value. You may want to consider the RX6700 instead, it has better rasterisation performance (=more fps or higher resolution when raytracing is off), and now that FSR3 is finally a thing, the lack of DLSS support is less of an issue.

Of course if you want to play with AI, the 3060 is a better choice. You can run things like stable diffusion on AMD, but it’s a hassle. Nvidia is a lot more plug and play in that regard.

WelperHelper99 · Oct 15, 2023

snov said:
The 3060 is essentially just a 1080Ti with (very limited) raytracing (in many benchmarks it actually performs worse), which in 1080p gaming you’re unlikely to want anyway (turning it on will drop you from 100fps to 10). The only benefit it really has is DLSS support. It’s honestly not great value. You may want to consider the RX6700 instead, it has better rasterisation performance (=more fps or higher resolution when raytracing is off), and now that FSR3 is finally a thing, the lack of DLSS support is less of an issue.

Of course if you want to play with AI, the 3060 is a better choice. You can run things like stable diffusion on AMD, but it’s a hassle. Nvidia is a lot more plug and play in that regard.

The ai thing I'm interested in too, anime pictures and all that, which is why I've been looking at the 3060 for the most part. It gets done what I need at a budget price, which is all I'm asking. And from what you're saying, it's not the fanciest thing in the world, but it gets the job done.

GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separe but Equal. Intel rides in the back of the bus.

Piss Towel Therapist

Don't leave that comment in draft

Piss Towel Therapist

At least it's not a Boeing!

Piss Towel Therapist

Piss Towel Therapist

et doigte...

ontologically stochastic

Piss Towel Therapist

et doigte...

Piss Towel Therapist

Irrefutable Rationality

Ultimate Chaos, Ultimate Confort.

Irrefutable Rationality

Piss Towel Therapist

Unlimited Sneed Works

Ruin is inevitable, and all else is prelude

Unlimited Sneed Works

Ruin is inevitable, and all else is prelude

Unlimited Sneed Works