Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

Sorry if this is an obvious question, but does anyone know a service that would let me upload an image and "expand" it in a matching style or create a matching background? For example, going from this:

View attachment 5126183

to this:

View attachment 5126184

Is it possible?
Like Post Reply said, you're thinking of outpainting. There are limited free online services that can do it, but if your computer can run Stable Diffusion decently, you might as well install it and just do it yourself.
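If you do run it locally, the preprocessing is simple: outpainting is just inpainting on a padded canvas. Here's a minimal Pillow sketch of that prep step (the padding size and fill color are arbitrary choices; the canvas/mask pair is what you'd feed to an inpainting-capable model):

```python
# Sketch: prepare a padded canvas and mask for outpainting.
# White mask areas = regions the model should invent, black = keep.
from PIL import Image

def make_outpaint_inputs(src: Image.Image, pad: int = 256):
    """Center the source on a larger canvas and build the matching mask."""
    w, h = src.size
    canvas = Image.new("RGB", (w + 2 * pad, h + 2 * pad), "gray")
    canvas.paste(src, (pad, pad))
    mask = Image.new("L", canvas.size, 255)            # fill everywhere...
    mask.paste(Image.new("L", (w, h), 0), (pad, pad))  # ...except the original
    return canvas, mask

canvas, mask = make_outpaint_inputs(Image.new("RGB", (512, 512), "red"))
print(canvas.size)  # (1024, 1024)
```

From there, an inpainting model fills in only the white region, so the original image survives untouched in the middle.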

 
A friend wanted me to make some artwork for her tabletop book. The theme's like jungle expedition and what have you.

sample_51360-0.png

This model's unique in that I wound up training SD for 200 epochs first on a dataset of 1,500 illustrations of characters wearing pelts and other fantasy-appropriate garb. Once that training was done, I did another 15 epochs on a dataset of 2,500 generated images using Terese Nielsen's artwork to get the art style down. The downside is that even with an A100 GPU the model took around 18 hours to train, but I'm rather pleased with the results.
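As a sanity check on those numbers, assuming one optimizer step per image (batch size 1 is my assumption; the poster didn't say), the two stages work out to roughly a fifth of a second per step:

```python
# Back-of-envelope check on the 18-hour training time quoted above,
# assuming one optimizer step per image (batch size 1).
stage1_steps = 200 * 1500   # 200 epochs over 1,500 illustrations
stage2_steps = 15 * 2500    # 15 epochs over 2,500 generated images
total_steps = stage1_steps + stage2_steps
sec_per_step = 18 * 3600 / total_steps
print(total_steps, round(sec_per_step, 3))  # 337500 0.192
```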
 
Apologies if this video has been posted already, and I didn't create this, but I think this deserves to be shared:


Like, this is legitimately amazing. It's a month old, but just a couple of months before this video was made you wouldn't have been able to do this and have it look this nice. Every frame of animation would have looked different, as if it were switching from character to character every frame. But now every frame looks like exactly the same character each and every time. It's basically a new form of rotoscoping, and it looks better than most examples of rotoscoping. To compare, here is an example of what is probably the best modern rotoscoping there is:


That entire dance was animated using traditional digital animation drawn over a video of a woman dancing while dressed as the character. Despite the uncanny effect of a big anime head stuck on the realistically proportioned body of a woman, the actual animation is extremely smooth. Stable Diffusion isn't too far from achieving that level of smoothness, which is honestly pretty impressive.
 
Apologies if this video has been posted already, and I didn't create this, but I think this deserves to be shared:


Like, this is legitimately amazing. It's a month old, but just a couple of months before this video was made you wouldn't have been able to do this and have it look this nice. Every frame of animation would have looked different, as if it were switching from character to character every frame. But now every frame looks like exactly the same character each and every time. It's basically a new form of rotoscoping, and it looks better than most examples of rotoscoping.
It has notable clothing-consistency issues, which the Corridor Digital crew managed to mostly avoid by training their model to draw a specific unique person in a specific outfit. So in theory, the software is at this point capable of rotoscoping roughly on par with the Chika dance, though there would likely still be minor inconsistencies with larger or sudden movements.
 
I finally decided to bite the bullet and get this working last night. A really simple prompt of a zombie apocalypse in the American Civil War yielded this:
View attachment 5115558
Probably the best 1920x1080 image I've gotten out of this thing so far. Generating images at this size usually creates melted hellscapes, but that actually fits this specific prompt.
Converting it to JPG killed the metadata, so here's the actual prompt: "Zombie apocalypse, American civil war, 1860s". There are no negative prompts and everything else is set to default.
View attachment 5115559
Sorry, did I say trad? I meant to say tard.
I recommend creating smaller-resolution images first and then upscaling them.
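On the metadata point above: the A1111 WebUI saves the prompt and settings in a PNG text chunk called "parameters", and JPEG simply has no slot for it. A quick Pillow round-trip shows what gets lost in conversion:

```python
# Sketch: the WebUI keeps the prompt and settings in a PNG tEXt chunk
# named "parameters"; saving as JPEG drops it. This round-trips the
# chunk with Pillow so you can see where the data lives.
import io
from PIL import Image
from PIL.PngImagePlugin import PngInfo

meta = PngInfo()
meta.add_text("parameters", "Zombie apocalypse, American civil war, 1860s")

buf = io.BytesIO()
Image.new("RGB", (64, 64)).save(buf, format="PNG", pnginfo=meta)

buf.seek(0)
print(Image.open(buf).info.get("parameters"))
# Zombie apocalypse, American civil war, 1860s
```

So if you want to keep the prompt recoverable, share the original PNG or paste the parameters alongside the JPG.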
 
I've had weird problems with the automatic1111 webui and CUDA, so I've been searching for a decent fork and found this one: https://github.com/vladmandic/automatic

CUDA just works consistently, and performance for non-upscaled images is notably faster. The UniPC sampler in particular gives me 8-9 it/s, versus the 2.5-3 it/s I got with Euler a in the 1111 webui. That said, I don't have any anime girls to share, just the Stable Diffusion equivalent of Celtic notebook doodles.

00014-2038310686.cleaned.jpg
00021-530902607.cleaned.jpg
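For a rough sense of what those sampler speeds mean in wall time, here's the arithmetic, assuming about 30 sampling steps (the step count is my assumption, not the poster's):

```python
# Convert sampler throughput (iterations/second) into seconds per image,
# assuming ~30 sampling steps.
steps = 30
for name, its in [("UniPC", 8.5), ("Euler a", 2.75)]:
    print(f"{name} at {its} it/s: ~{steps / its:.1f} s per image")
# UniPC at 8.5 it/s: ~3.5 s per image
# Euler a at 2.75 it/s: ~10.9 s per image
```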
 
Haven't checked out AI stuff since NovelAI was leaked; how bad is the GPU strain nowadays? I want to try it out, but I'm a laptop scrub and I'm worried about damaging my GPU by overheating it, and I couldn't find anything recent about it.
 
Haven't checked out AI stuff since NovelAI was leaked; how bad is the GPU strain nowadays? I want to try it out, but I'm a laptop scrub and I'm worried about damaging my GPU by overheating it, and I couldn't find anything recent about it.
I'm genning on my laptop and it's still okay. If you're concerned about memory usage, there are parameters you can add to the .bat to reduce it.
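For reference, those parameters go on the COMMANDLINE_ARGS line in webui-user.bat. --medvram and --lowvram are the usual low-memory flags in the A1111 WebUI; which combination works best will depend on your card:

```bat
rem webui-user.bat -- low-VRAM option for the AUTOMATIC1111 WebUI.
rem --medvram trades some speed for lower memory use;
rem swap in --lowvram if that still isn't enough.
set COMMANDLINE_ARGS=--medvram
```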
 
I've been getting back into proompting again now that I've learned you can get much better models and use LoRAs and all kinds of cool stuff, but the issue is that I'm still on my 1060 6GB. Generating a single 768x768 image takes me three and a half minutes on Automatic1111's Stable Diffusion WebUI and two minutes on vladmandic's SD.Next.

I'm thinking about getting something like a used 3070, both for better gaming performance and for better Stable Diffusion performance without having to buy a new PSU; a used 3070 seems the most reasonable choice in my case. What kind of performance increase could I expect if I got one and tried to generate the same image?
 
I've been getting back into proompting again now that I've learned you can get much better models and use LoRAs and all kinds of cool stuff, but the issue is that I'm still on my 1060 6GB. Generating a single 768x768 image takes me three and a half minutes on Automatic1111's Stable Diffusion WebUI and two minutes on vladmandic's SD.Next.

I'm thinking about getting something like a used 3070, both for better gaming performance and for better Stable Diffusion performance without having to buy a new PSU; a used 3070 seems the most reasonable choice in my case. What kind of performance increase could I expect if I got one and tried to generate the same image?
It really depends on other parameters, like the sampler you're using, the number of sampling steps, and whether you use hires fix. I have an NVIDIA RTX 3070 Laptop chip that can generate an image at those dimensions in about 10 seconds, assuming Euler a as the sampler, approximately 30 sampling steps, and no hires fix.

But yeah, CivitAI is a cornucopia of checkpoints and LoRAs. If you haven't already, consider checking out ControlNet, because it can do things like detect hard edges, estimate depth, and even generate images from provided poses.
 
I've been getting back into proompting again now that I've learned you can get much better models and use LoRAs and all kinds of cool stuff, but the issue is that I'm still on my 1060 6GB. Generating a single 768x768 image takes me three and a half minutes on Automatic1111's Stable Diffusion WebUI and two minutes on vladmandic's SD.Next.

I'm thinking about getting something like a used 3070, both for better gaming performance and for better Stable Diffusion performance without having to buy a new PSU; a used 3070 seems the most reasonable choice in my case. What kind of performance increase could I expect if I got one and tried to generate the same image?
go with a used 3060 with 12GB of VRAM. less performance technically, but due to the increased VRAM size it's still faster for smaller sizes and can handle larger image sizes much better
 
go with a used 3060 with 12GB of VRAM. less performance technically, but due to the increased VRAM size it's still faster for smaller sizes and can handle larger image sizes much better
If I can generate multiple images in the time it takes me to generate one right now, I'm happy sacrificing some Stable Diffusion VRAM-related performance for better performance in other applications. Just how much more beneficial is that extra 4GB in the 3060 in Stable Diffusion compared to the 3070 anyway? If it's a few seconds of generation time, I don't really care, as long as I cut it down to seconds overall and get better gaming performance outside of SD.
 
Yeah, the video stuff is obviously fake and wouldn't fool anyone into thinking it's real. Meanwhile, AI images indistinguishable from real photos, made using consumer-grade hardware, have been out for at least a year now.
Not really indistinguishable, and I'm not talking about the fingers but about how, when you get a high-res version of the fake pic, it all looks drawn. It has no texture; it only fools some people when it's a low-res pic, so you can't see the details, because there aren't any.
 
Not really indistinguishable, and I'm not talking about the fingers but about how, when you get a high-res version of the fake pic, it all looks drawn. It has no texture; it only fools some people when it's a low-res pic, so you can't see the details, because there aren't any.
You've only seen images created by non-photorealistic models.
 
Okay, can someone tell me exactly what the biggest differences between an RTX 3060 and an RTX 3070 are in Stable Diffusion? The only benefit the 3060 has is the 12GB of VRAM; beyond that, its VRAM has less bandwidth, it has fewer Tensor cores, and it is simply weaker overall. What do I really gain from those 12GB of VRAM? Can I do something that I wouldn't be able to do with 8GB of VRAM? Is the speed improvement counted in seconds?

Because I really do not see the point in getting a weaker GPU just because it has more VRAM. I want to see the real difference in numbers, and then compare it to my 1060 to see if there's even a point.
 
Okay, can someone tell me exactly what the biggest differences between an RTX 3060 and an RTX 3070 are in Stable Diffusion? The only benefit the 3060 has is the 12GB of VRAM; beyond that, its VRAM has less bandwidth, it has fewer Tensor cores, and it is simply weaker overall. What do I really gain from those 12GB of VRAM? Can I do something that I wouldn't be able to do with 8GB of VRAM? Is the speed improvement counted in seconds?

Because I really do not see the point in getting a weaker GPU just because it has more VRAM. I want to see the real difference in numbers, and then compare it to my 1060 to see if there's even a point.
Mostly speed of image generation, and the maximum batch size/image resolution you can use without running out of VRAM and making the entire process grind to a halt.

I can really only speak from my own experience (using a 1080), but with 8 GB of VRAM on the automatic1111 UI with CodeFormer you're pretty much limited to a maximum image size of roughly 1080p (1920x1080). It may be (and probably is) possible to render images at a slightly higher resolution, but that's the size I've tested and found will work with any model 100% of the time. So with 12 GB you'll be able to create larger images and batches. How large? No idea. I only create high-resolution images one at a time and never run a batch of small images larger than 4.

As for the speed of the two cards, check the chart here. The 3070 gets about a one-third speed increase over the 3060.
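As for why VRAM rather than raw speed is what caps resolution: SD's U-Net works on latents at 1/8 of the image resolution per side, and activation memory grows roughly with the latent pixel count, so a bigger canvas inflates memory use fast. A back-of-envelope comparison (the linear-scaling assumption is a simplification, not a measurement):

```python
# Rough intuition: VRAM use during sampling scales roughly with the
# number of latent "pixels" (image resolution divided by 8 per side).
def latent_pixels(w, h):
    return (w // 8) * (h // 8)

base = latent_pixels(512, 512)  # SD 1.x native resolution
for w, h in [(512, 512), (768, 768), (1920, 1080)]:
    print(f"{w}x{h}: {latent_pixels(w, h) / base:.2f}x the 512x512 latent size")
```

By this estimate a 1920x1080 render needs roughly 8x the activation memory of a 512x512 one, which is why the extra 4 GB shows up as "can render bigger" rather than "renders faster".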
 