GPUs & CPUs & Enthusiast hardware: Questions, Discussion and fanboy slap-fights - Nvidia & AMD & Intel - Separate but Equal. Intel rides in the back of the bus.

Would you be willing to give me a few pointers to get me started on SD with an AMD card?
Sure thing!

So basically get docker set up on your computer, pull the rocm-pytorch image, and start it. Inside the container you git clone the automatic1111 repo and do the normal setup (same as for Nvidia).

The SD general on /g/ has a long list of models you can download, but I'd stick with SD 1.4 and 1.5 unless you specifically want anime-style, in which case I'd do a 70% blend of SD with one or more models trained on anime (avoid the danbooru-only models, it's impossible to make them not be lewd).

Launch SD and read the end of the terminal output to get the port (or specify a port when you launch it). In a web browser go to 127.0.0.1:port. Go into the settings and tell it to save outputs to /outputs. Close SD by hitting ctrl-c in the terminal. Open a new terminal, and save a snapshot of your rocm-pytorch container as "stablediffusion". Now you can close down your open container.

Write a bash script to start the docker daemon, save its PID into a variable, and then launch your docker container. Include an argument to bind mount /outputs in the container to a folder on your computer. The next line should be kill "$PID" to stop the daemon, and then chown -Rv you:users "/absolute path to your outputs folder" to automatically own your output files.

One of the benefits of using docker is that if something breaks with an update, which happens every now and then, you can very easily roll back to a working install. Just don't release old snapshots until you're sure the updated one works well. Personally, I don't delete them at all. I could roll back all the way to September 2022, when I first played with SD, if I wanted.
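For anyone who wants to see the snapshot, launch, and chown steps spelled out, here is a rough sketch that only builds and prints the commands (it does not run them). The container name, host folder, and user are made up, and `--network=host` plus the `/dev/kfd` and `/dev/dri` device flags are my assumptions about a typical ROCm container setup, not something stated in the post.

```python
import shlex

SNAPSHOT_IMAGE = "stablediffusion"      # snapshot name used in the post
HOST_OUTPUTS = "/home/you/sd-outputs"   # hypothetical host folder


def commit_command(container, image=SNAPSHOT_IMAGE):
    """Save a container's current state as a reusable image."""
    return ["docker", "commit", container, image]


def run_command(host_outputs, image=SNAPSHOT_IMAGE):
    """Start the snapshot with the GPU exposed and /outputs bind-mounted."""
    return [
        "docker", "run", "-it",
        "--device=/dev/kfd",        # ROCm compute device (assumed setup)
        "--device=/dev/dri",        # GPU render nodes (assumed setup)
        "--group-add", "video",
        "--network=host",           # assumption: lets 127.0.0.1:port reach the web UI
        "-v", f"{host_outputs}:/outputs",
        image,
    ]


def chown_command(host_outputs, owner="you", group="users"):
    """Take ownership of the files the container wrote."""
    return ["chown", "-Rv", f"{owner}:{group}", host_outputs]


if __name__ == "__main__":
    for argv in (
        commit_command("my-rocm-container"),   # hypothetical container name
        run_command(HOST_OUTPUTS),
        chown_command(HOST_OUTPUTS),
    ):
        print(shlex.join(argv))
```

Keeping older snapshot names around (for example by committing to a dated image name each time) is what makes the rollback described above possible.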

Just ask me if you need more specifics, okay? We could maybe do it with PMs or if you want to start a new thread instead, since it's kind of offtopic for this thread.
I have tried. I found a tutorial which was supposed to "convert" models somehow. I don't recall it involving Docker at all. And whilst I sort of got it working the results were greatly disappointing and I then broke it with some sort of version issue.

I'm not a technical novice, but I don't really know how to get this working and just a few good links or the concepts would help. The tutorial I read didn't really explain everything and assumed you understood how SD and models all work. So when something went wrong, I couldn't figure it out.
That sounds like what you were actually running was the CPU version? Yeah, performance on that is going to be beyond awful. My 6900XT gets about 8it/s, which is decent, though still quite a bit lower than an Nvidia would do. Oh well. Your RDNA3 card will be much faster. You'll be able to do higher than 768x768 too, that's about where the limit for my card is.
 

At first I thought the barely increased L2 performance numbers sounded bunk, but looking up how Intel went from 1.25 MB to 2 MB on the 12900K vs the 13900K, it seems in line.

I'd be surprised if the L4 tile dealie they're talking about for Intel's next arch ends up being added to the mainstream chips though, so I'd say the 3D V-Cache chips are still the winners for a while.
 
Not the chip. It's mobo makers pushing out voltages that are too high.
Board makers claim their mobos are operating to AMD's spec. And now AMD is pushing out new AGESA code to mitigate the problem.
"We have root caused the issue and have already distributed a new AGESA that puts measures in place on certain power rails on AM5 motherboards to prevent the CPU from operating beyond its specification limits, including a cap on SOC voltage at 1.3V. None of these changes affect the ability of our Ryzen 7000 Series processors to overclock memory using EXPO or XMP kits or boost performance using PBO technology.
We expect all of our ODM partners to release new BIOS for their AM5 boards over the next few days. We recommend all users to check their motherboard manufacturers website and update their BIOS to ensure their system has the most up to date software for their processor.
Anyone whose CPU may have been impacted by this issue should contact AMD customer support. Our customer service team is aware of the situation and prioritizing these cases."
 
Same, I've been using AMD shit for years and never had issues but I'm not exactly a power user and don't make a lot of "under the hood" changes. I will say that the radeon adrenalin software is kind of annoying to use and feels like it was programmed and designed by people who don't have to actually use it. Bloated and slow.
The AMD-ATI merger was a colossal cluster from what I understand.
It just reminds me of my cousin begging me to solve his issues on STALKER (Too bad Todd "It just werks" isn't in charge of GSC now) crashing on AMD cards and all I had to go off of was an empty stack trace. Told him to eat shit and get rid of his 5900.

That being said, I have always wanted a successor to the R9 Nano in RDNA form with HBM. Now THAT would be interesting.
 
Trying to understand why you want something isn't minimizing those wants. "It's cheaper" is as good a reason as any.
One reason was/is if you want to build a new computer but don't want to splurge on a GPU just yet. E.g., a friend of mine running Win 7 has to update to Win 10 to continue using Steam, but Win 10 complains about his hardware. (I think he's on a gen 4/Haswell CPU.)

Another reason is that they can make a good cheapo home made machine for parents or as a gift for someone who likes retro games. I gave my brother a Raspberry Pi with a bunch of retro games on it and he loves it. Though as you say, there's mini-PCs that do the job just as well and for cheap. I'm tempted to get him something similar this year. Either a cheap mini-PC or one of these if the price is under £100.

Though if I'm giving him PS2 and GameCube games, I'll need a better storage solution for that.
 
The video speculates about dual-channel RAM; it's not possible on Alder Lake-N chips. It seems like the increased performance and ability to use DDR5 or LPDDR5 makes up for it.

It should also be possible to force even Windows 11 to install on Sandy/Ivy/Haswell hardware. It may not be desirable, IDK.
 
@snov Reply limits prevent me quoting you. I didn't reply to your much appreciated post at the time because I was trying out your suggestions and experimenting to figure this out. Thanks for the pointers - they did indeed help. I realised from your post that I never made clear I was on Windows but that wasn't an issue. I could translate your instructions from GNU/Linux well enough. So following along I figured out a lot more about how all this hangs together - Python versions, what PyTorch is, etc. I did get it all working but it still reverted to CPU. What I've learned (or think I've learned) is that the RDNA 3 cards (i.e. 7900XT, 7900XTX) aren't supported by ROCm at the moment. Which, based on my self-educated understanding, means I'm stuck with CPU rendering.

It's not all bad - I have a Threadripper build with a tonne of RAM. It seems to parallelise pretty well from looking at the performance monitor and it's nice to have something that eats up 50+GB of RAM as the hardware hasn't been used to its full effect in some time. It seems to range between 8.5 s/it and 21 s/it. I don't fully understand those numbers and I know they are bad compared to what you listed but probably quite good for CPU rendering?

But ultimately, this is a little sad. Reading a few forums here and there it seems it took AMD a very long time after launch for them to add support for 6000 series so I'm not optimistic about support for my GPU being suddenly around the corner.

I don't like Nvidia as a company much and I like their GPU prices just as little. But I think if I really want to have some fun with this I should sell my 7900XT with its great but useless 20GB of VRAM, and get myself a 4080 if I can stump up the insane amount of money for one.

None of this takes away from how helpful your advice was, though. Got me past the wall I was blocked by.
 
I suppose those are good numbers for CPU, but yeah, they're not amazing. My midrange last gen GPU easily outclasses your workstation CPU despite its very limited memory. Some quick searching says the 6900XT had ROCm support by February 2021, and was released December 2020, so on that schedule your 7900XT should already have it.
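To put the two sets of numbers side by side: "it/s" is sampling iterations per second and "s/it" is seconds per iteration, so one is the reciprocal of the other. A quick sketch of what that means per image, assuming a 20-step image (the step count is just an illustrative assumption, it wasn't mentioned in either post):

```python
def seconds_per_image(steps, its_per_second=None, seconds_per_it=None):
    """Convert a sampler speed into wall-clock seconds for one image."""
    if (its_per_second is None) == (seconds_per_it is None):
        raise ValueError("give exactly one of its_per_second or seconds_per_it")
    if its_per_second is not None:
        return steps / its_per_second      # faster form: iterations per second
    return steps * seconds_per_it          # slower form: seconds per iteration


STEPS = 20  # assumed step count, for illustration only

if __name__ == "__main__":
    print("GPU at 8 it/s:   ", seconds_per_image(STEPS, its_per_second=8), "s")
    print("CPU at 8.5 s/it: ", seconds_per_image(STEPS, seconds_per_it=8.5), "s")
    print("CPU at 21 s/it:  ", seconds_per_image(STEPS, seconds_per_it=21), "s")
```

So under that assumption the GPU figure works out to a couple of seconds per image, while the CPU figures land in the range of minutes.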

You might want to give https://github.com/nod-ai/SHARK a try before you sell your GPU (unless you actually want to change to Nvidia, if you're on Windows I won't begrudge you that, there's lots of good reasons to prefer team green there). It's a Stable Diffusion built on Vulkan instead of PyTorch. I've never tried it so I can't really offer you any support or guidance, but it claims to support Windows so you can probably find some guides for it on reddit. /g/ also has some guides on setting up Stable Diffusion on Google's developer VPS, which apparently works and is free up to a point?
 
But ultimately, this is a little sad. Reading a few forums here and there it seems it took AMD a very long time after launch for them to add support for 6000 series so I'm not optimistic about support for my GPU being suddenly around the corner.
AMD has been consistently behind on drivers for their shit forever. It's like how Cummins (Dodge) trucks get the rap of being constantly towed because of transmission failures, GM having shit injectors, and Ford being found dead in a ditch after the head gasket fails (or timing chain failure).
 
@snov Thanks. I'll check out Shark. I don't have a deep understanding of how this all works internally. If that is Stable Diffusion built on Vulkan instead of CUDA, will all models still work on it? Is a model / checkpoint entirely independent of the technology under it or can it have hardware specific requirements?

FWIW, I've been on and off considering ditching Windows for GNU/Linux for a while now. I like Windows 10 a lot but it's all downhill from there and they'll force me onto W11 with its crappy MacOS inspired, cloud-focused crap sooner or later if I stay. What I'll probably do is build a NAS - planning to anyway - keep a Windows laptop around and maybe have the beast with the threadripper and the big GPU for the occasional gaming and more serious work. I'd reinstall everything on it and either just have GNU/Linux or a dual-boot if there were a reason, e.g. gaming.
 
Sorry, I've no idea how Shark works. You'll have to read up on it on your own.

Instead of dualbooting, what you should do is a Windows virtual machine. Dualbooting is kind of obsolete imo.
Your Threadripper has plenty of cores to spare, and probably really good IOMMU groups since it'll be on a workstation board, so you can pass your graphics card through to the VM for whenever CPU rendering in the VM won't be enough (and CPU rendering has come a long way, right now I'm sitting with Solidworks on a 4k window, and getting solid 60 fps on an eight-core VM). Install a weak secondary GPU to use as your primary output device and you can keep the strong GPU passed through for games 24/7. Looking Glass is a special method of controlling a VM that uses a shared memory space to transfer frames from the VM to the host at any resolution and refresh rate the GPU can handle, with next to no latency. I have a 7950X, so I use the internal GPU on the host and pass through the dedicated one, and my Looking Glass has zero noticeable delay (though I'm not a very good gamer, your experience may differ). Mostly I use Windows' built-in RDP though, it does have noticeable latency and it compresses in such a way that details in red look bad, but it's entirely liveable. That's how I'm using Solidworks right now.
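If you want to check how your IOMMU groups look before committing to passthrough, a rough sketch like this lists them from sysfs on a Linux host. It assumes the IOMMU is enabled in firmware and the kernel, in which case the groups appear under /sys/kernel/iommu_groups; on a machine without that directory it just returns nothing.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: [PCI addresses]} read from the sysfs tree."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups                       # IOMMU off, or not a Linux host
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                entry.name for entry in devices_dir.iterdir()
            )
    return dict(sorted(groups.items()))     # numeric order, not string order


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        print(f"group {number}: {', '.join(devices)}")
```

Roughly speaking, a GPU is easiest to pass through when its group contains only the card's own functions (typically the video and audio functions at the same PCI address), since everything in a group goes to the VM together.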
 
I've gotten some benchmarks done on the new 96-core EPYC and the 56-core Xeon Sapphire Rapids, and all I can say is, if you're looking to burn $10K on a CPU, it would be foolish to waste it on Intel this generation. Note that SPR was supposed to compete with second gen EPYC (Rome). It's very much a day late and a dollar short.

What I've learned (or think I've learned) is that the RDNA 3 cards (i.e. 7900XT, 7900XTX) aren't supported by ROCM at the moment. Which based on my self-educated understanding, means I'm stuck with CPU rendering.

I can give some insight into AMD vs NVIDIA, and why I would recommend switching to NVIDIA despite generally disliking them as a company prone to overhyping and control-freak behavior, and preferring AMD for my personal machines.

NVIDIA has been building their architecture ground-up around CUDA for around a decade now. Their consumer, workstation, and datacenter GPUs are all fundamentally built out of the same building blocks and can run the same software. Everything's backward compatible, everything's coherent. It's not unlike how if your binary will run on an Intel laptop CPU from 2008, it'll run just fine on a brand-new Xeon.

AMD isn't like that. They've been lurching from one architectural paradigm to another for years, and they made the very bad decision to commit to OpenCL, which is just an awful programming model. When you're working with AMD GPUs, it feels like nothing is compatible with anything, and nothing shares anything at all. Unlike NVIDIA, two GPUs in the same generation aren't binary-compatible, so you rely on JIT compiling a lot to run. This has resulted in lots of subtle bugs that only appear on one specific SKU. Slow, glitchy, problem-filled rollouts are something AMD has no way to navigate out of.

ROCm is supposed to fix all this, but we're 5+ years away from all the pre-ROCm GPUs filtering out of the market. Moreover, my experience with their ROCm/HIP ecosystem has been uniformly negative: compiler failures, documentation errors, build systems that were coded by monkeys, etc. This is a company that clearly has horrendously bad internal software practices. Unless you have a QA team that is paid to shoulder the work of iterating through every turd AMD is going to pinch off in your lap, I wouldn't rely on them to do big-kid work in AI/ML or simulations. I'd just stick with NVIDIA. Even if AMD starts today with reforming everything they do wrong, they're 5+ years from being able to release software that doesn't make me want to become a coal miner.
 
Thanks @snov (still can't quote you). One thing I'm not sure about with Linux vs. Windows is disk encryption. With Windows I have Secure Boot and BitLocker encryption on all of my drives. This is a big plus to me. I don't know what the equivalents are on Linux these days but I want my system fully and solidly encrypted in a way that can't be bypassed if someone gets hold of the machine physically.
 
You can mark the text and click reply in the little window that pops up. @snov works too, I've gotten every notification so far.

I'd trust the Linux encryption more than Microsoft's for what it's worth. Bitlocker can be unlocked with your Microsoft Account, which means Microsoft can unlock it at will, which means all the government has to do is ask them. Linux asks you to enter a passphrase at boot (which you can set up to provide with a fingerprint reader or a USB key etc), and that unlocks the drive. The government would have to bother bringing a bucket of water and a towel to make you unlock your computer for them.

I'd recommend ZFS, it's my favourite file system in the world. It supports encryption, compression, snapshots, setting special properties for each dataset, and if you give it multiple drives it can do RAID and checksums to restore your files if they rot. It's also super easy to back up to a NAS or offsite server, with the send feature. It's delightful. One example of how you could do your pool would be four HDDs in RAID10, and two SSDs in RAID1 as your metadata special device. Set the special block size for your OS datasets to equal your record size and it'll store everything operating system related on the SSDs, making those files SSD-fast to access, while seamlessly giving you all the bulk storage capacity of your HDDs. You can also use ZFS datasets to store VMs, giving you access to the lovely ZFS features even on obsolete filesystems like NTFS.

I could talk about ZFS all day.
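A rough sketch of the pool layout described above, written as a script that only builds and prints the zpool/zfs commands rather than running them. The pool name, dataset name, and disk names are hypothetical placeholders, and 128K is just an assumed record size; the idea is that setting special_small_blocks equal to recordsize sends that dataset's blocks to the SSD special device.

```python
import shlex


def create_pool(pool, hdds, ssds):
    """Two mirrored HDD pairs striped together, plus a mirrored SSD special vdev."""
    if len(hdds) != 4 or len(ssds) != 2:
        raise ValueError("this sketch expects four HDDs and two SSDs")
    return [
        "zpool", "create", pool,
        "mirror", hdds[0], hdds[1],
        "mirror", hdds[2], hdds[3],
        "special", "mirror", ssds[0], ssds[1],
    ]


def create_os_dataset(dataset, recordsize="128K"):
    """Encrypted, compressed dataset whose blocks all qualify for the special vdev."""
    return [
        "zfs", "create",
        "-o", "encryption=on",
        "-o", "keyformat=passphrase",
        "-o", "compression=lz4",
        "-o", f"recordsize={recordsize}",
        "-o", f"special_small_blocks={recordsize}",
        dataset,
    ]


if __name__ == "__main__":
    hdds = ["hdd-a", "hdd-b", "hdd-c", "hdd-d"]   # hypothetical disk names
    ssds = ["ssd-a", "ssd-b"]                     # hypothetical disk names
    print(shlex.join(create_pool("tank", hdds, ssds)))
    print(shlex.join(create_os_dataset("tank/os")))
```

Bulk datasets would simply leave special_small_blocks at its default so their data stays on the HDDs.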
 
On Linux that would be LUKS (Linux Unified Key Setup). You need kernel support and the right kernel options for it, alongside an initramfs. LVM (Logical Volume Manager) is required if you have /boot on the ESP or the bootloader does not support decryption. There is TrueCrypt/VeraCrypt/tcplay compatibility support, but I do not know how good it is. Just remember that LUKS metadata shows your cipher type (your choice of anything in the kernel crypto API*, or write your own), making it possible to brute force your password if it's shit.

*AES, Anubis, CAST5/6, Twofish, Serpent, Camellia, Blowfish, DES, ShangMi(1/2), IDEA, GOST (read-only), Elliptic Curve Diffie-Hellman, EC and its variants, RSA, ElGamal (I think), DSA etc.
 
Do these work with TPM chips on board the motherboard or in the CPU? And what protections does GNU/Linux have on the bootstack? I've been out of the Linux game for quite a while and last time it came up it was Linux zealots declaring how Secure Boot was a gimmick and Linux didn't need it. That was quite off-putting to me at the time as it was around the time I switched from Linux to Windows, I recall thinking that the tables were turning and Windows was becoming the more secure OS now.

To be clear, I'm not saying Linux is less secure, I'm just asking what the situation is. And it's true that MS can unlock the drive with BitLocker but it's very effective against non-State actors who my drives might fall into the hands of.

@snov Your enthusiasm is catching. And like I say, I have been thinking about going back to my Linux roots (pre-System D days so I'm sure I'm useless on it now). But I was thinking of building my own NAS. Now I'm a little caught between options. My threadripper build is 1st gen. So it's pretty old. But at the same time it's threadripper so it could last me years more before I need more power (if I ever do). I hope to have more money soon and could do a full new build of something quite powerful if I really wanted. But not that AND the NAS.
 
Well, it would be either ZFS or the LUKS+LVM solution Canam suggests. They’re essentially mutually exclusive, ZFS wants direct control of the drives. LVM also does RAID, but it won’t checksum, and it also won’t backup as conveniently as ZFS, so if those things matter to you I’d still say ZFS. Linux won’t store keys in your TPM (but it will use it to boost its random number generator), you enter the keys manually at boot, but you can use Secureboot these days. Either go LTS and sign your kernel yourself, or use a separate bootloader that checksums your kernel before booting.
Secure Boot still isn’t actually secure though, Microsoft can sign any bootloader they like and you can be sure the CIA has several compromised ones, which means anyone willing to pay has compromised bootloaders. They would need access to the machine first though, and going to all that effort to steal your data is silly when they could just start breaking fingers until you unlock it for them.

LUKS is theoretically the more secure option, you can store your partition headers (the things that mention which algorithm is in use) on a flash drive you hide after booting the machine, but that’s for properly paranoid people who don’t mind losing all their data when a bit randomly flips on the flash drive in a couple of years.
 
I use TPM for heaps of stuff. It works fairly well but you kinda have to get down and dirty with how it actually works, last time I checked there aren't really "ready-made" solutions. (Might differ by distro) My encryption scheme has an easy pin that's bound to the machine via TPM (so it can't be bruteforced, also easy to invalidate if something's fucky) and a more elaborate password that's machine independent but not realistically open to attack. I also use it for SSH keys and to mark a remote machine compromised in a way where an intruder could not remove that mark after the deed. It's all more a because I can thing really though, I could live without it.

What people often don't get about Linux when they cite the learning curve and complexity is that when you actually know what you're doing and implement everything to your specification, it just works. No need to adapt to the newest OS paradigm every two years because some people want to secure their jobs. You set it up once from the ground up and it's yours. I have the feeling if I would've stuck to Windows back in the day, in total hours it would've probably caused me more work (and headaches) than Linux ever did.

I personally use btrfs and it once saved me from/drew my attention to buggy Intel NVMe firmware presenting files in a corrupted way under certain circumstances. When I researched file systems I got the impression a lot of the bad rep btrfs gets is from people blaming hardware failure on the file system because it had the audacity to error out instead of just silently corrupting. I can't and do not want to speak for the early btrfs days though.
 