Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

So I've spent about twenty minutes with ComfyUI and I don't think I'm ever going back to Automatic1111. I don't know why the performance difference is so striking, as I thought the difference was primarily UI. I was motivated to try this new "ComfyUI" I saw a reference to because when I tried SDXL at 1024x1024, generation speed absolutely tanked compared to previous models at 512x512. From maybe 30s to 7min+. Switching to ComfyUI has restored things to fast turn-around. This is on an AMD RDNA3 card, btw. As AMD continues to improve support for it, I hope things get faster still.

Some people have said ComfyUI's downside is greater complexity. I don't find it so at all. In fact I find it simpler than Automatic1111.

Also, I think it will be easier to script large batches of images when I try running it on some paid GPU instance in the cloud. Which I intend to do as soon as I (a) have time and (b) am certain I won't accidentally run it for a month and rack up a bill of a couple of thousand pounds.
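For anyone else wanting to script batches: ComfyUI exposes an HTTP API on its default port 8188, and you can POST a workflow exported via "Save (API Format)" in the UI. A minimal sketch, assuming a local server; the sampler node id "3" is hypothetical, so check your own exported JSON:

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"  # default local ComfyUI server

def build_payloads(workflow, n_images, sampler_node="3"):
    """One request body per image, varying only the sampler seed.
    The node id "3" is hypothetical -- look it up in your exported JSON."""
    payloads = []
    for seed in range(n_images):
        wf = json.loads(json.dumps(workflow))  # cheap deep copy
        wf[sampler_node]["inputs"]["seed"] = seed
        payloads.append(json.dumps({"prompt": wf}).encode())
    return payloads

def queue_batch(workflow_path, n_images):
    """Queue n_images generations against a running ComfyUI server."""
    with open(workflow_path) as f:
        workflow = json.load(f)
    for body in build_payloads(workflow, n_images):
        req = urllib.request.Request(
            COMFY_URL, data=body,
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)  # queues one generation

# queue_batch("workflow_api.json", 100)
```

The nice part for cloud use is that the server just holds a queue, so a script like this can fire off a whole run and then you only pay for actual generation time.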
 
Also, I think it will be easier to script large batches of images when I try running it on some paid GPU instance in the cloud. Which I intend to do as soon as I (a) have time and (b) am certain I won't accidentally run it for a month and rack up a bill of a couple of thousand pounds.
I doubt you would even rack up more than $700 by accident if you left it on for a month. A 4090 runs for around $0.65 an hour if you use Lambda or RunPod. Plus RunPod uses credits, so you pay up front. I used to be worried as well lol.
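The worst case at the rate quoted above is easy to sanity-check (the ~$0.65/hr figure is from this post; check the provider's live pricing):

```python
# Worst-case cost of forgetting a cloud GPU for a month,
# at the ~$0.65/hr 4090 rate quoted above.
rate_per_hour = 0.65
hours_in_month = 24 * 30
monthly_cost = rate_per_hour * hours_in_month
print(f"${monthly_cost:.2f}")  # -> $468.00
```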
 
I'm not using SDXL just yet, but I was browsing to see what checkpoints and LoRAs there were on CivitAI and I found some dude had taken pictures of his own dick and turned it into a ~2 GB LoRA. God bless.

🙈🍆 My Penis 🍆🙈 (Archive can't access logged-in-only content; obviously NSFW)
Maybe he gets a kick out of how many virtual girls his virtual dick is going to be inside. I was earlier going to ask how many pictures of something you need in order to train a model but I think now I'm happy not knowing.
If AMD can get parity with Nvidia in this field then that would suit me very fine. Though I'll never get back the hours it took me to get everything working on AMD. Upscaling is still shaky though I'm optimistic I can get it working right in ComfyUI whereas it wouldn't in Automatic1111.

For home users, if they can get to parity with Nvidia then that puts them in the lead, really, because they're already better value for gaming, imo. It's AI and some of the software features where they fall behind. Someone in the comments makes an interesting point about how SDXL requires more VRAM, and since the trend is towards needing more VRAM, AMD has an edge. With the 7900 cards having either 20GB or 24GB of VRAM and the 4080 topping out at 16GB, that could work out well.

I think it will be a good long time before AMD can fully compete with Nvidia in terms of convenience and some nice to have features like that AI noise-dampening. We'll see though. Good find.
 
Someone in the comments makes an interesting point about how SDXL requires more VRAM, and since the trend is towards needing more VRAM, AMD has an edge. With the 7900 cards having either 20GB or 24GB of VRAM and the 4080 topping out at 16GB, that could work out well.
You need around 10GB-12GB to generate images in SDXL. The 24GB club is for if you are training LoRAs. A used 3060 12GB at $250 is fine for the average user trying to run SDXL. Nvidia has a problem with the secondary market and with GPUs being used as a money printer. The cards still being sold off from the 2020 crypto run have been disastrous for new sales of consumer cards, and it was probably one of the things that caused EVGA to drop out of the market. It's good to see that AMD is aware of the secondary Nvidia consumer market and is adjusting prices and the specs on its products to compete in the midrange consumer AI space, as opposed to Nvidia, who is just upping performance and pushing out expensive SKUs.
 
You need around 10GB-12GB to generate images in SDXL. The 24GB club is for if you are training LoRAs.
Well, sounds like training LoRAs is out for me. But that's okay. I think my next step is trying to generate images on some rented GPU resource, so if I start to figure out training models and LoRAs it'll be a natural progression to do it there. I'm actually half-thinking I might have made a mistake in buying my new GPU, as cost-wise I'd have to do a hell of a lot of local image generation to make it cheaper than just renting. But... it is nice for the occasional gaming I do, and my Radeon 480 was old enough that I felt justified in a three-generation leap. I'd given up waiting for prices to come down.
 
For home users if they can get it to parity with Nvidia then that puts them in a lead really because they're already better value for gaming, imo. It's AI and some of the software features they fall behind in.
AMD is finally getting its shit together with ROCm's recent release. Now with actual Windows support! They're still playing catchup, but hopefully they'll be an actual competitor soon.
 
I've made a great amount of progress with my LoRA in 1024x1024. Not sure if it is NSFW, but I'm going to spoil it anyway.

00015-3056903936.png

Faces and hands are still hard and weird to do.

Edit: fun fact, the original Stable Diffusion was trained on cropped 512x512 images. This resulted in the upper body and arms being excluded from a lot of the training images in favor of the torso. It's only now, with SDXL's scaling and filling, that we are starting to see improvements.
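The way SDXL avoids that cropping is aspect-ratio bucketing: instead of center-cropping everything to a square, each training image gets resized to the bucket whose aspect ratio it matches best, with all buckets having roughly the same pixel area. A toy sketch; the bucket list here is illustrative, not SDXL's exact one:

```python
# Illustrative buckets: multiples of 64 with area near 1024x1024.
BUCKETS = [(1024, 1024), (1152, 896), (896, 1152),
           (1216, 832), (832, 1216), (1344, 768), (768, 1344)]

def pick_bucket(width, height):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    aspect = width / height
    return min(BUCKETS, key=lambda b: abs(b[0] / b[1] - aspect))

print(pick_bucket(3000, 2000))  # 3:2 landscape photo -> (1216, 832)
```

So a wide shot keeps its arms and a portrait keeps its head, instead of both being hacked down to whatever fits in a square.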
 
I've made a great amount of progress with my LoRA in 1024x1024. Not sure if it is NSFW, but I'm going to spoil it anyway.

Faces and hands are still hard and weird to do.
You poor boy. Whoever allowed you on the internet made a mistake.
 
You poor boy. Whoever allowed you on the internet made a mistake.
Oh def man. I'm actually a major advocate for keeping kids off the Internet during their formative years. I'm fucked, but the important lesson anyone can learn from this is that you can channel what's wrong with you into something productive. I've learned a shitload about ML from this. I'm actually thinking about switching my career to image processing of three-dimensional environments. It's always been a dream of mine to help people at jobs that can't be replaced, like manufacturing of vehicle chassis or things needing HODs. I really think that SD is the entry point for a generation of ML experts. How they got there isn't important at this point, so long as their contributions are positive.
 
So, for people still interested in LoRAs: I think I found the sweet spot, it's around 100-150 images. My previous example above was generated using 60 images of a combination of males and females. For some strange reason it's incapable of producing images of men; however, every 1/20 times it can produce an image that vaguely passes as a woman (and by that I mean barely human), which can be improved by passing it through img2img recursively.

I've trained a new LoRA on another artist with 109 images of exclusively females, and the results are far more consistent. In the past I've effectively used image sets of up to 500 images, which were then combined with their mirror images and used to produce a decent Hypernetwork, although those images were significantly dissimilar. If you are interested in training on style, I would recommend using Kohya with AI_Now prodigy.

If you want to, you can lower the max epochs down to 60. I've never had a LoRA come out that isn't cooked after 40.
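For a sense of how long a run like that actually is, the usual Kohya bookkeeping is optimizer steps ≈ images × repeats × epochs ÷ batch size. The 109 images and 60 epochs below are from this post; the repeat and batch values are made-up placeholders you'd set in your own config:

```python
# Rough training-length arithmetic for a Kohya-style LoRA run.
images = 109       # dataset size from the post above
repeats = 2        # placeholder: per-image repeats per epoch
batch_size = 4     # placeholder: training batch size
epochs = 60        # the max-epochs value from the post above

steps_per_epoch = images * repeats // batch_size
total_steps = steps_per_epoch * epochs
print(total_steps)  # -> 3240
```

Useful mostly for guessing wall-clock time before you kick off a run, since step time on your card is easy to measure.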
 
For some strange reason it's incapable of producing images of men; however, every 1/20 times it can produce an image that vaguely passes as a woman (and by that I mean barely human), which can be improved by passing it through img2img recursively.
cfg_rescale_webui is useful if you overtrain a model or a LoRA.
 
I like how the AI embraces "WE WUZ KANGZ".

That's because whoever did it was a we wuz retard, else he would've loaded LoRAs for the correct number of fingers on hands and pyramids that don't bend at the top.
1693457500188.png
I take it its stuff is from different authors, since some characters do look like what actual Egyptians looked like, but whoever put in the black guys with dreadlocks should be taken out back and shot; the only people wearing dreadlocks at the time were... wait for it... the Spartans. They were also well armored; nobody goes to war with nothing but a helmet and a cape as protection.
1080ti, 8GB VRAM.... I need an upgrade so badly I can taste it... Especially since I'm doing text stuff now.
What kind of text stuff?

BTW, still waiting for those super-realistic (can't-tell-it's-real) people done with AI. So far all I've seen only works at really low res; even at 1024x1024 you can see the "traces".
The cards still being sold off from the 2020 crypto run have been disastrous for new sales of consumer cards,
Not enough to get Nvidia to lower their insane prices. Plus, the problem with used mining cards is that a lot of people who got into that didn't know shit about PCs, let alone crypto; they bought a ready-to-use miner and plugged it in, nothing else. I've met these people IRL, one owns a pizza place. So these cards are beaten to fuck. Everybody says "nah, miner cards are alright because it's an investment" yadda-yadda, but again, these people don't know shit about hardware. They wouldn't clean the cards; one broad told me she was selling her rig whole because she was too dumb to figure out how to take the cards out.

A friend got an ex-mining 3080 Ti recently; sure, it came with the original box and everything, but the miner "forgot" the foam to keep the card in place, plus it was filthy. And of course, once he installed it, it didn't work. No idea if it was a lemon or broke during shipping, but the seller can suck it, because my buddy got a refund and the seller got his now-broken and worthless card back.
It's good to see that AMD is aware of the secondary Nvidia consumer market and is adjusting prices
Yeah, but AMD still sucks for SD and AI in general, doesn't it? Like, a 6700 XT scores more or less the same as a 3070, but in the SD benchmarks I've seen it was way further behind.
 
Need that 3060 or 3080 12GB VRAM baby.
My 2060 12GB is just enough to scrape by, though finding the right settings to get SDXL not to crash was a little tricky. It kills me to have a 6900 XT in my Windows machine that I can't use effectively.
 
My 2060 12GB is just enough to scrape by, though finding the right settings to get SDXL not to crash was a little tricky. It kills me to have a 6900 XT in my Windows machine that I can't use effectively.
--medvram in Automatic is your best bet on a 12GB-or-less card for SDXL, and it's what most people with 8GB have been running up through SD 2.0 when doing 768 gen. I run SDXL on a headless box; without --medvram I'm looking at 11.8GB of GPU memory usage on a 1024 txt2img. With it, I'm down to around 8.5GB and can do img2img. I feel bad for people trying to run SDXL on non-specific cards or midrange cards.
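As a rough rule-of-thumb helper (the thresholds are just this thread's anecdata, not official requirements), the flag choice boils down to:

```python
def sdxl_flag(vram_gb):
    """Which Automatic1111 memory flag to try for SDXL, by VRAM size.
    Thresholds are rules of thumb from forum experience, not official."""
    if vram_gb >= 12:
        return ""            # enough headroom for 1024 txt2img as-is
    if vram_gb >= 8:
        return "--medvram"   # ~8.5GB observed with it on a 1024 gen
    return "--lowvram"       # last resort; trades a lot of speed for memory
```

--lowvram is the real next step down in Automatic1111 if --medvram still OOMs, though it's painfully slow.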
 