Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

I've not worked on SD since XL came out. Is there a way to make it run on more potato-y machines? I'm having out of memory errors at the last iteration every time. Or am I just doing something wrong?
Try messing with your VAE-related settings. In SD.Next, mine used to bomb out whenever it tried to move the base model to the CPU while running the VAE.
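For anyone scripting it rather than using a UI, the same fix looks roughly like this with the diffusers library (an untested sketch; the repo ID is the stock SDXL base, and the method names are diffusers' own, not SD.Next's settings):
[CODE]
# Rough sketch: keep SDXL from OOMing at the final VAE decode step.
# Assumes torch + diffusers + accelerate are installed and a CUDA GPU is present.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # or a local checkpoint path
    torch_dtype=torch.float16,
)

# Stream model parts between CPU and GPU instead of keeping everything resident.
pipe.enable_model_cpu_offload()

# Decode the latent in tiles/slices so the VAE pass doesn't spike VRAM at the very end.
pipe.enable_vae_tiling()
pipe.enable_vae_slicing()

image = pipe("a kiwi bird watching the ocean", num_inference_steps=30).images[0]
image.save("test.png")
[/CODE]
The A1111 equivalents are its low/medium-VRAM and no-half-VAE launch options.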
 
  • Informative
Reactions: Puff
I've been out of the loop for a while; what's the current take on SD vs. DALL-E? Playing with Bing, I can tell DALL-E is better at having characters interact with each other than SD was back when I used it a lot, but I haven't used any models since 1.3. Has SD also improved?
 
  • Agree
Reactions: The Last Stand
By the way, if you've noticed that Bing Image Creator now says it's part of Microsoft Designer, they've got an AI-assisted image editor. It's got a background remover, various filters and templates, icons and stock graphics, plus you can upload images (and fonts if you don't like its selection). As you muck around with something, it will start spitting out variations that you can click on and refine.
I've heard of Microsoft Designer but never used it. Let me try.

1700610127602.png

Yeah, this is subpar.
 
SD vs. DALL-E
Honestly, people are treating the new DALL-E as the second coming, but I've not really seen anything that *really* impressed me. Also, you have literally no control over what DALL-E will actually generate in the end, and most generations look kinda same-y after you've seen the first few dozen. I haven't used it myself, but from this thread I get the impression that it's wholly unsuited for serious workflows where you actually want something specific in the end. Seems more like an AI toy, to be honest.

A lot has happened to SD since 1.3. I'd advise giving the current developments and ComfyUI a try.
 
  • Optimistic
Reactions: The Last Stand
Yeah, I can't wait! I've been using a 6900 XT but haven't updated SD in ages, since whenever I do I have to fix the Docker container for ROCm PyTorch. I recently got a great deal on a 4090, though, so once it arrives I'll get back up to date with SD.
I've googled ComfyUI and it looks super confusing. Could I just stick with Automatic1111 instead, if that's still viable?
 
  • Like
Reactions: inception_state
It mostly comes down to UI preference; they both support SD and SDXL with similar generation speeds. The main difference is that it's easier to just throw some prompts into A1111 and get started, while Comfy has more advanced tools for building complex, automated multi-step workflows.
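To give a concrete idea of what "workflow" means in Comfy terms: the graph you build in the UI can also be queued as plain JSON against Comfy's local HTTP API. A minimal txt2img graph looks roughly like this (a sketch from memory; it assumes a local instance on the default 127.0.0.1:8188, and the checkpoint filename is a placeholder):
[CODE]
# Sketch: queue a basic txt2img workflow against a local ComfyUI instance.
# The checkpoint name is a placeholder and must exist in models/checkpoints.
import json
import urllib.request

workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"text": "a cartoon rat pointing a rifle at a carton of juice "
                             "tied up on a chair, cartoon style",
                     "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"text": "blurry, deformed", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 1234, "steps": 30,
                     "cfg": 7.0, "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "api_test"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())
[/CODE]
Once the graph is JSON, you can script batches, parameter sweeps, or multi-stage pipelines without touching the UI, which is the "advanced workflow" angle people mean.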

Honestly, people are treating the new DALL-E as the second coming, but I've not really seen anything that *really* impressed me. Also, you have literally no control over what DALL-E will actually generate in the end, and most generations look kinda same-y after you've seen the first few dozen.
Yeah, it starts looking same-y after a while. The main advance there was basic natural-language prompts generating results with multiple objects/concepts composed in a reasonable way. E.g., we were messing around in the bossmanjack chat earlier, and with DALL-E I prompted "a cartoon rat pointing a rifle at a carton of juice tied up on a chair, cartoon style" and got this:

OIG (12).jpeg OIG (13).jpeg OIG (11).jpeg
Pretty reasonable, and basically what I was expecting. In one of them the rat is on the chair, but at least it has all the major elements.

Now SDXL with the same prompt:
15133-1632060040-a cartoon rat pointing a rifle at a carton of juice tied up on a chair, carto...jpg 15134-1632060041-a cartoon rat pointing a rifle at a carton of juice tied up on a chair, carto...jpg 15135-1632060042-a cartoon rat pointing a rifle at a carton of juice tied up on a chair, carto...jpg
Most of the pieces are there, a bit deformed, but the relationship between the objects is wrong.
 
Just started messing around with SD this week. Using ComfyUI and coming in with zero prior knowledge, I've found it super easy to just pick up and start using. If there's something I don't understand, I go find something on YouTube or Reddit that will explain it.
I'm not doing anything particularly groundbreaking, but it's been entertaining just yoinking the information from stuff that catches my eye on CivitAI and running it through different models or altering the prompts slightly.


Glass fruit:
ComfyUI_02402_.png ComfyUI_01123_.png

Cats made out of food:
ComfyUI_01442_.png ComfyUI_01543_.png

Gundam-esque Mechas:
ComfyUI_02237_.png ComfyUI_01963_.png
 
The only thing I'm not liking about ComfyUI so far is that I miss the one-click image-to-image transfer, and the inpainting isn't very intuitive. I'll probably figure it out, though. For basic generation it's easy and significantly quicker than Automatic1111.
 
I get basically identical benchmarks between the two. E.g., I did a batch of four 1024x1024 SDXL images with a basic prompt and 30 Euler sampling steps, and it was 27 seconds with Comfy and 28 seconds with A1111.

It makes sense since they are running the same code under the hood, using the same libraries. You can use the same Python virtual environment for both. So really there shouldn't be significant differences unless one UI really screwed something up. Comfy had an edge for a bit with SDXL, but A1111 got patched in September to fix the SDXL issues that were slowing it down.
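If anyone wants a UI-independent baseline for that exact workload, it's easy to time straight from Python with diffusers (rough sketch; assumes the stock SDXL base checkpoint and enough VRAM for a batch of four):
[CODE]
# Sketch: time four 1024x1024 SDXL images at 30 Euler steps outside any UI,
# to sanity-check what the hardware itself can do.
import time
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

# Warm-up pass so model load and kernel setup don't pollute the timing.
pipe("warm-up", num_inference_steps=5)

start = time.time()
pipe("a cartoon rat in a jungle", num_inference_steps=30,
     num_images_per_prompt=4, width=1024, height=1024)
print(f"batch of 4 took {time.time() - start:.1f}s")
[/CODE]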
 
SDXL just OOM'd on my machine in A1111. The older SD 1 and 2 models are the only thing I can really give insight on, but I was getting 0.95-1.15 seconds per iteration in A1111 and I'm pushing 0.35 or so in Comfy. It's worth noting that my internet is shit, so I skimp on updates, and I had to pull a driver update just to get Comfy running. If the UIs really are different, it's probably because I'm running an older graphics card and something isn't supported in A1111.
 
  • Like
Reactions: inception_state
That would likely be the issue. Do a git pull and then rebuild your Python env for A1111 overnight or something if you want to get it working again sooner, but if Comfy is working well for you, there's no real need, I guess.

Edit: Chris-chan style Bossmanjack, the fusion no one asked for.
15149-1524970348-_lora_bossmanjack_v6_0.9_, b0ssman, 1boy, facial hair, holding rat, solo _lor...png 15150-1524970349-_lora_bossmanjack_v6_0.9_, b0ssman, 1boy, facial hair, holding rat, solo _lor...png 15151-1524970350-_lora_bossmanjack_v6_0.9_, b0ssman, 1boy, facial hair, holding rat, solo _lor...png 15148-1524970347-_lora_bossmanjack_v6_0.9_, b0ssman, 1boy, facial hair, holding rat, solo _lor...png
 
SDXL models are way different to prompt than the old ones... at least the one I started with is. They seem more tied to a certain type of image than the last model format, or maybe it's just the one I picked up. Expanding now, ever so slowly, on junglenet.
Kiwi attempts (using an example prompt from the model for quality)

ComfyUI_00012_.png ComfyUI_00014_.png ComfyUI_00015_.png ComfyUI_00021_.png ComfyUI_00022_.png
 
Most of the pieces are there, a bit deformed, but the relationship between the objects is wrong.
For me, this is the next "big breakthrough" I'm looking forward to with local models. Yeah, there are ways around SD/SDXL's inability to compose according to spatially precise instructions, using inpainting and regional-prompting extensions, but this sort of effortless understanding of prepositions makes me envious.
 
  • Like
Reactions: inception_state
SDXL models are way different to prompt than the old ones... at least the one I started with is. They seem more tied to a certain type of image than the last model format, or maybe it's just the one I picked up. Expanding now, ever so slowly, on junglenet.
I can get SDXL to do kiwi, but sometimes I have to sneak up on it. "kiwi bird watching the ocean" negative: "green fruit"
SNEED8.jpg SNEED7.jpg
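For the script-inclined, that trick is just the negative_prompt argument in diffusers (rough sketch, same stock SDXL base as in the earlier snippets):
[CODE]
# Sketch: steer "kiwi" toward the bird by pushing the fruit reading into the
# negative prompt, per the post above.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="kiwi bird watching the ocean",
    negative_prompt="green fruit",
    num_inference_steps=30,
).images[0]
image.save("kiwi_bird.png")
[/CODE]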
 
New AI animation just dropped, but before I show it, let's talk about AI animations during early 2023. This used to be peak AI animation:
View attachment 5508681
Clothes, face and body were inconsistent, they had to use a normal background because a generated one wasn't suitable, and the animation, while good for its time, looks rough nowadays. I forget when this was created, but the video blew up sometime around March 2023.

Now let's fast-forward to April 14, 2023, when this video blew up on Twitter:
View attachment 5508735
Nothing in this video is consistent yet either, but there's one clear improvement, and that's the animation: it's smooth as butter.

Now let's jump to the present, which is the 11th of November, and take a look at this:
View attachment 5508671
The background is still inconsistent, and the hair too (though it's not that noticeable), but the clothes, face and body no longer change like it's a time-lapse. The person who made this is cfsstudios, and he uploads stuff like this on TikTok. The improvement in AI animation over the past few months is astonishing. It went from the image shifting completely every frame to this.

Anyway, here's another video they made; it's similar but set in the daytime. It's less consistent, but still pretty good:
View attachment 5508809
Any leads on what sort of configuration was used to generate this? I've played with a bunch of the animation modules and settings and can never get any real consistency, let alone anything near these.
 
  • Like
Reactions: I'm a Silly