Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

That happens sometimes for me too, and it seems to happen more often with horizontal aspect ratios, though that may be entirely unrelated. I haven't done any real troubleshooting since it only occurs about once every twenty batches, so I just shrug and assume an algorithm fucked up somewhere. If you find one of the prompt/seed combinations where it does that and get it to output results step by step, you might be able to see exactly when it's screwing up. Not sure if it's something you'd be able to fix, but at least you could see what's going wrong.
I think I might have figured out most of the problem: it could be that the output dimensions exceed the input's, which causes the error. I resized my output to be at most the input's size and it comes out fine, and conveniently Automatic1111 comes with a built-in upscaler to blow it up to whatever size you want afterwards.
 
I think I might have figured out most of the problem: it could be that the output dimensions exceed the input's, which causes the error.
I think this is something people commonly forget, but these models have a resolution they are trained at, and if you deviate from that resolution the quality of the composition (i.e. the image making sense overall and not having weird elements inserted in strange places) declines pretty fast. What you want to do is output at the standard resolution (512x512 for Stable Diffusion 1.4, not sure for other ones but it's probably the same) and then upscale after. It makes iterating faster anyway, as you only have to upscale the images you actually want at the end.
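If anyone wants to see what that workflow looks like outside the UI, here's a minimal sketch with the diffusers library (the model ID, prompt, and sizes are just placeholder assumptions): generate at the native 512x512, then upscale only the keepers.

```python
import torch
from diffusers import StableDiffusionPipeline
from PIL import Image

# Load SD 1.4 (the version the post mentions); fp16 to save VRAM.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

# Generate at the model's native training resolution so the composition stays sane.
image = pipe("a castle on a hill, oil painting", width=512, height=512).images[0]

# Upscale afterwards (plain Lanczos here; a dedicated upscaler would do better).
image.resize((1024, 1024), Image.LANCZOS).save("castle_1024.png")
```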
 
I think this is something people commonly forget, but these models have a resolution they are trained at, and if you deviate from that resolution the quality of the composition (i.e. the image making sense overall and not having weird elements inserted in strange places) declines pretty fast. What you want to do is output at the standard resolution (512x512 for Stable Diffusion 1.4, not sure for other ones but it's probably the same) and then upscale after. It makes iterating faster anyway, as you only have to upscale the images you actually want at the end.
Do you know if there's a way to quickly adjust dimensions to keep the aspect ratio? I'm stuck manually readjusting every time to make sure my subject doesn't come out squished in img2img.
 
I wish. WebUI's stream of new features is amazing, but workflow stuff like that never gets addressed: resuming with settings takes an extra step, some slider values are hidden at smaller window sizes, moving prompts between tabs is a hassle, and populating from the image history/browser has always been at least half-borked.
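For the aspect-ratio gripe, a tiny helper works as a stopgap: given the input image, pick img2img output dimensions that keep its ratio, snapped to multiples of 64 the way the UI sliders step. (The function name and defaults are my own invention, not anything in WebUI.)

```python
from PIL import Image

def fit_dims(path, long_side=768, multiple=64):
    """Output (width, height) matching the input's aspect ratio as closely
    as multiples of 64 allow."""
    w, h = Image.open(path).size
    scale = long_side / max(w, h)
    snap = lambda side: max(multiple, round(side * scale / multiple) * multiple)
    return snap(w), snap(h)

print(fit_dims("input.png"))  # e.g. a 1200x800 input -> (768, 512)
```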

I think this is something people commonly forget, but these models have a resolution they are trained at, and if you deviate from that resolution the quality of the composition (i.e. the image making sense overall and not having weird elements inserted in strange places) declines pretty fast. What you want to do is output at the standard resolution (512x512 for Stable Diffusion 1.4, not sure for other ones but it's probably the same) and then upscale after. It makes iterating faster anyway, as you only have to upscale the images you actually want at the end.
That's what highres fix is for, btw. Afaik it generates at the standard resolution first, then uses that result as the noise base for an img2img pass up to the desired resolution, which keeps at least the high-level composition consistent.
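For anyone curious, here's that two-pass idea sketched in diffusers terms, assuming that's roughly what highres fix does (model ID and settings are placeholders):

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline
from PIL import Image

prompt = "a castle on a hill, oil painting"

# Pass 1: txt2img at the native resolution to get a sane composition.
txt2img = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
base = txt2img(prompt, width=512, height=512).images[0]

# Pass 2: img2img over an upscaled copy; moderate strength keeps the
# composition while re-diffusing the details at the higher resolution.
img2img = StableDiffusionImg2ImgPipeline(**txt2img.components)
final = img2img(prompt, image=base.resize((1024, 1024), Image.LANCZOS),
                strength=0.5).images[0]
final.save("castle_hires.png")
```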

It ain't that, though, since I think we're talking about img2img outputting black canvases. --no-half/--no-half-vae is a good tip (btw, the VRAM impact only applies to loading the model, so it's not the end of the world, which is lucky because you need it for older cards), but you might need --precision full too (or try just that first?).
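Those flags boil down to running the model in float32 instead of float16, since some older cards produce black outputs on the fp16 path. The diffusers equivalent, for comparison (model ID is a placeholder):

```python
import torch
from diffusers import StableDiffusionPipeline

# Full precision: roughly double the VRAM for the weights, but no black
# canvases on GPUs with broken or missing fp16 support.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float32
).to("cuda")
```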
 
Don't know if this is old news by now, but dezgo.com is an in-browser AI image generator powered by Stable Diffusion 1.5. I don't know how the experience of using it compares to what you much more technologically capable chaps have been up to, but I'm just excited to be able to make Shreks.

download (11).png
"Shrek garden gnome" He is considering a scheme I have proposed to him.
download (19).png download (20).png download (21).png download (22).png
"Zdzisław Beksiński shrek" Makes me want to start drinking again just so I can have him appear in a withdrawal-fuelled fever dream where I try to run from him while shaking and sweating.
 
Don't know if this is old news by now, but dezgo.com is an in-browser AI image generator powered by Stable Diffusion 1.5.
Gave it a spin. It's great for people with weak GPUs, and it doesn't seem to run on a credit system, though it lacks some of the features of Automatic1111's web UI, like generating a batch of output images or inpainting.
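If you do end up running it locally, batch generation is a one-liner in diffusers, for what it's worth (model ID and filenames are placeholders):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# One call, several candidates; cheaper than re-running the prompt by hand.
images = pipe("Shrek garden gnome", num_images_per_prompt=4).images
for i, im in enumerate(images):
    im.save(f"gnome_{i}.png")
```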
 
Get a Paperspace account, pay ~$8 a month for Pro, and use the free-tier GPUs with the web UI. You get a lot of usage out of a very fast GPU that's almost always available. I was thinking about buying a suitable GPU myself, but I picked up an AMD card recently and I don't play with SD nearly often enough to make it worth it, since I barely play any videogames that can't run on my iGPU as it is. (I barely get any use out of the AMD card at this point; it was kind of a waste of money.) If you're doing this 24/7 it's possible you'll run into some invisible limitation on the service; I don't know. These services are nice when you play around with GPU computing in general and only occasionally need an enterprise card for a project but obviously don't want to shell out for one.

There are also ways to use them as regular graphics cards with VirtualGL and Linux. I managed it once with Google Colab before they neutered that; I don't know if it works with Paperspace, but I can't see why not. At least with Colab, the latency was too arse and the CPUs too weak to really run any serious 3D-accelerated program on it.

They do charge you extra each month for hard drive space if you take up too much, so be careful about that.

Otherwise there's NAI, which also offers text generation up to OPT (?) 20B, I think. I don't know about their payment scheme, though, and the entire service seems to be aimed at generating porn, really. Then again, maybe that's what you want. There's also Midjourney; I haven't really tested it post-SD, but it was impressive before that. Expensive, though.
 
Don't know if this is old news by now, but dezgo.com is an in-browser AI image generator powered by Stable Diffusion 1.5. I don't know how the experience of using it compares to what you much more technologically capable chaps have been up to, but I'm just excited to be able to make Shreks.
I'm starting with image-to-image. The first thing I tried to make was a Lego character eating another one, like Francisco de Goya's Saturn Devouring His Son. Now I'm putting in Moraff's World monsters with Greg Rutkowski in the prompt. It's not going too well, but I can see how people could get absorbed in this, even using this shitty-but-free one-image-per-shot version. If nothing else, it's good practice before committing to a service or a GPU purchase.

You have to remember to resize or clean up images before using them as input. It doesn't know what to do with a 128x128 sprite, but resize it to 512x512 and it's fine.
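If anyone wants to script that step, nearest-neighbor keeps pixel-art edges crisp when blowing a sprite up (filenames are placeholders):

```python
from PIL import Image

# 128x128 sprite -> 512x512 input; NEAREST avoids smearing the pixel art.
sprite = Image.open("monster_128.png").convert("RGB")
sprite.resize((512, 512), Image.NEAREST).save("monster_512.png")
```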

Upgrading the graphics of Moraff's World with AI would be an extremely autistic endeavor.

Lego_character_eating_a_baby_in_the_style_of_Francisco_de_Goya_542837545.png Lego_character_eating_a_baby_in_the_style_of_Francisco_de_Goya_1743921273.png Lego_character_eating_a_baby_in_the_style_of_Francisco_de_Goya_2896943593.png Hooded_figure_and_woman_in_boat_2360981061.png Dungeon_filled_with_monsters_Greg_Rutkowski_1666317947.png

This is one thing that can happen when you don't crop a black bar out:

Dungeon_filled_with_monsters_Greg_Rutkowski_2190506022.png

The attempted horns and blood spray are from the ceiling and walls:

Ogre_Greg_Rutkowski_3168969835.png Ogre_Greg_Rutkowski_2563110183.png Ogre_Greg_Rutkowski_2137714300.png Ogre_Greg_Rutkowski_1946710045.png Ogre_Greg_Rutkowski_4165724372.png Ogre_Greg_Rutkowski_132046710.png Ogre_Greg_Rutkowski_2909454413.png Ogre_Greg_Rutkowski_2700342067.png Ogre_Greg_Rutkowski_2919842568.png Lego_man_eating_a_Lego_baby_Francisco_Goya_720869845.png Lego_man_eating_a_Lego_Francisco_Goya_2830134129.png

Now for some text-to-image:

tornado_coming_out_of_toilet_horror_destruction_bathroom_672436530.png smiling_Egyptian_Pharaoh_standing_and_holding_a_beer_pyramids_in_background_in_color_1627358166.png triumphant_black_King_standing_with_arms_crossed_pyramids_in_background_2111371613.png Fractal_mouse_in_meadow_hallucinogenic_acid_trip_499235613.png Fractal_mouse_in_meadow_hallucinogenic_acid_trip_837868455.png Man_using_meat_grinder_horror_in_the_style_of_Francisco_Goya_1162884133.png Yellow_Lego_Brick_Man_using_meat_grinder_horror_in_the_style_of_Francisco_Goya_2353652942.png Yellow_Lego_Brick_Man_eating_bloody_meat_horror_in_the_style_of_Francisco_Goya_s_Saturn_Devour...png Yellow_Lego_Brick_Man_eating_bloody_meat_horror_in_the_style_of_Francisco_Goya_s_Saturn_Devour...png Yellow_Lego_Brick_Man_eating_bloodied_baby_horror_in_the_style_of_Francisco_Goya_s_Saturn_Devo...png Realistic_Lego_man_eating_baby_horror_in_the_style_of_Francisco_Goya_s_Saturn_Devouring_His_So...png Lego_man_eating_baby_horror_Francisco_Goya_Saturn_Devouring_His_Son_3272031864.png Lego_man_eating_baby_horror_Francisco_Goya_Saturn_Devouring_His_Son_3561470526.png

Much better attempts at Lego man horror with text-to-image. Smiling Black Pharaoh man standing in front of pyramids was largely unsuccessful.

I deleted a couple for looking too nude.

NSFL Bonus
Prompt: front gunt man
Aspect: 1 to the right of square
Guidance: 8
Seed: 3735664101
 
Anything that would be classified as "ecchi" should also be clearly indicated, the same as NSFW. The visible layer of this thread should be strictly work-safe, like something you could browse at a Starbucks.
Edit: I'm fucking retarded.
Thanks for the clarifications!
 
What's the difference between Stable Diffusion 1.4 and 1.5?


And once again in language a blockhead like me can understand?
 
Me and Mr.
I added the porn packages to my custom package.
Not safe for existence.

She thanks you a lot for this and really appreciates it. She's been trying to pull off mixed prompts, while I have been prompting every weird idea I've had for the last couple of days.
She wanted to know: if you're training a model, would you release it, and would it have women in pantsuits in it?
Me and Mr. Birds are wrapping up a few final experiments, and then he is going to come and talk to everyone about how to train your own AI.

He said that merging too many modules together can cause problems, because the modules will recognize different prompts and this can lead to weighting issues between them. I thought I would mention it since you said you are mostly merging modules to improve your results. Mr. Birds may also be wrong about this, since we haven't spent much time on merging and went straight for directly training our AI.
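For context, merging checkpoints is usually just a weighted average of the two models' weights, something like this sketch (the file names and the 50/50 ratio are placeholders), which is also why prompts from both end up sharing weight the way Mr. Birds describes:

```python
import torch

# Weighted-sum merge of two Stable Diffusion checkpoints.
a = torch.load("model_a.ckpt", map_location="cpu")["state_dict"]
b = torch.load("model_b.ckpt", map_location="cpu")["state_dict"]

alpha = 0.5  # 0.0 = pure A, 1.0 = pure B
merged = {k: (1 - alpha) * a[k] + alpha * b[k] for k in a if k in b}
torch.save({"state_dict": merged}, "merged.ckpt")
```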

The training module and the cloud-hosted training method (DreamBooth) both require some basic Python knowledge. If that sounds like too much, then I will get with you and try to figure out a way to give you a module trained to do whatever your wife would like. We've been tackling a lot of the basics: teaching it what weapons are, improving its knowledge of faces and hands, dynamic poses, two figures interacting, etc.

We're also working on teaching the AI to recognize new art styles, which has been... interesting.

Prompt: the virgin mary, smiling, with big, soft eyes, detailed, realistic, masterpiece, illustration, (((colorful))), radiant, fantasy, lackadaisy

Result:
virginmarycat.png

Tracy J. Butler/Lackadaisy:
tracyjbutler-lackadaisy.jpg

PS- Lackadaisy is the insanely cool story of bootlegging prohibition era cats. If you like flivvers, tommy guns, flappers, and cats, then check out her site. She offers the bulk of her comic free for everyone to enjoy. Yeah, it's furry adjacent, but it's basically like if Walt Disney imagined the 1920s. With cats.
 
I am going to talk about DreamBooth using @Starved Artist's example of women's pantsuits, just because it is a useful tool if you need a single thing done particularly well.

Here is a comprehensive walkthrough of how to use DreamBooth

This is Birme, a tool for bulk cropping images to 512x512 (a rough Python stand-in is sketched just below this list)

A list of all artists Stable Diffusion recognizes (included because I'm going to talk a little about teaching your AI new styles at the end of this)

A place for discussing prompts

A Hugging Face project that will randomly spruce up your prompt with extra prompts
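Since Birme is basically doing bulk center-crops, here's a rough Python stand-in for that step if you'd rather not upload anything (folder names are placeholders):

```python
from pathlib import Path
from PIL import Image, ImageOps

# Center-crop everything in raw_images/ down to 512x512 training tiles.
Path("cropped").mkdir(exist_ok=True)
for p in Path("raw_images").glob("*.png"):
    tile = ImageOps.fit(Image.open(p).convert("RGB"), (512, 512), Image.LANCZOS)
    tile.save(Path("cropped") / p.name)
```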

Okay, those are all the cool links. So, we started off training NovelAI on DreamBooth. We're no longer doing that, for reasons I'm about to discuss, but I still feel DreamBooth has its uses.

Basically, you gather up all your images and upload them to the cloud. They have to be organized into folders, and those folder names become the prompts added for the images inside them. So if you want women's pantsuits, you'd name the folder 'Women's Pants Suits', and after the training finished you would have a module that could render baller women's pantsuits. I say a module that does one thing because there are some problems inherent in trying to teach it a bunch of things at once. To make this easier, I'm going to explain what it does poorly in a 'cons' section and what it does well in a 'pros' section, and beyond that, good luck.
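In other words, the upload layout is the interface: one folder per concept, with the folder name doubling as the prompt. A trivial sketch of staging it (paths and the concept name are examples, not anything DreamBooth mandates beyond the folder-per-concept structure):

```python
import shutil
from pathlib import Path

# The folder name becomes the prompt the module learns.
concept = Path("training") / "Women's Pants Suits"
concept.mkdir(parents=True, exist_ok=True)
for src in Path("cropped").glob("*.png"):
    shutil.copy(src, concept / src.name)
```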

Pros:
With a decent collection of images and adequate time to train on them, it will in fact produce a module that is really good at what it's been trained to do. It's great if you want to add something extremely specific, like putting your face on other people's bodies or getting the AI to understand what women's pantsuits are.

Cons:
First, it isn't practical to train it on multiple things at once unless they're related. For instance, you might want to remind it what a woman is while you're training it on women's pantsuits; that would probably work okay. But if you try to teach it a bunch of wildly different things, their prompts all wind up with the same weight and the results bleed together. For instance, if I tried to teach it 'Jesus' and 'NASCAR', it would probably throw out something like Jesus driving a race car. And it will do this fairly often, so it can work against you as well as in your favor.

Training multiple subjects with large batches of images is similarly impractical with DreamBooth because it's shared among many users. Paying customers get first dibs on the training resources, and if too many people are trying to connect, your training session can get interrupted. If you time out, the module is unusable. DreamBooth takes a really long time to train (think 12 hours on average) and timing out is a regular occurrence, so it can get frustrating. Small batches run slightly faster, which is why I really do recommend it for small batches of images devoted to one specific thing. Then just merge the module.


What we're now running is Stable Diffusion with the training module on the same machine. It's considerably faster and more versatile, but I don't speak Python so I can only pester Mr. Birds until he comes and explains it. I will say that teaching it a style requires big batches of images and takes a while, but it does in fact learn.

Here are some 1925 Packards rendered in Tracy J. Butler's style from my last post. One of the reasons I picked her is that she's not on the list of known artists, as far as I can tell. I'll be back later to talk about what happens when you train further on an artist it already knows, once we've tried it; we're going to use Boris Vallejo. Currently training for Dorian Cleavenger (who I also believe is missing from the known-artist list).

Training where you find a blank spot is really fun because you can see the results of the training more easily. This seems to also be true of DreamBooth.
lackacars.JPG
 
NovelAI apparently launched the newest beta version of their furry model (v1.3) today. No idea how it stacks up against the (I presume many, at this point) other models out there based on furry art, but it does seem more coherent than the older version. In the changelog, the developers wrote "Give it a try and don't overlook the possibilities of this model, it can do so much more!", which makes me curious about what they mean. At the very least, it'll be interesting to see if this causes any further waves of artist tears from the furry community, though that probably depends on how it compares to other models at imitating specific artists.

Here's "a slobbering mutt." It lacks a certain je ne sais quoi the original has, but I'm sure someone can get more accurate results out of this model with some patience.
a slobbering mutt.png
 
This is a nice list to just go through! I quite like that one of the examples for "Dr. Seuss" appears to be a picture of the actual Theodor Geisel with an attempt to add a silly Seuss-style hat and facial hair.

(And Klimt is literally just remixes of The Kiss.)

I've found it really valuable for mocking up prompts because I can't remember that many artists. The Hugging Face prompt generator will usually tack on a couple of artists/styles, but it tends to lean on the same handful of people, so the list is helpful.
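Those prompt expanders are usually small GPT-2 fine-tunes you can also run locally; here's a hedged sketch with transformers, assuming a model like Gustavosta/MagicPrompt-Stable-Diffusion is the sort of thing linked above:

```python
from transformers import pipeline

# GPT-2 fine-tune that continues a bare prompt with style/artist tags.
expand = pipeline("text-generation", model="Gustavosta/MagicPrompt-Stable-Diffusion")
print(expand("a castle on a hill", max_length=60)[0]["generated_text"])
```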

I heard that the example images for the artists have been confirmed as actually being in Stable Diffusion's training data through some autistic Secret Squirrel process.

Anyway, for training, the biggest problem is over-training, which is why we want to see what happens when we train something it already knows; that's what makes the comprehensive list such a useful tool.

Our AI knows Dorian Cleavenger now, so next up is finding out whether additional training on something Stable Diffusion already recognizes improves or ruins the results.
CleavengerTraining2.pngCleavengerTraining.png
 