Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

So here's something I noticed in the SD3 API documentation.

Now, these images don't seem that much better in quality than SDXL outputs to me. What are your thoughts on the SD3 model so far? Besides the improved text handling and the Search and Replace feature, does it seem to make better images or follow prompts more closely than previous SD versions?
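For anyone who hasn't looked at the docs: Search and Replace is one of the new v2beta REST endpoints. A rough, untested sketch of a call, going off my reading of the documentation; the API key, file names, and prompts are all placeholders:

# Rough, untested sketch of the documented Search and Replace endpoint.
# API key, file names, and prompt values are placeholders.
import requests

resp = requests.post(
    "https://api.stability.ai/v2beta/stable-image/edit/search-and-replace",
    headers={
        "authorization": "Bearer YOUR_API_KEY",  # placeholder key
        "accept": "image/*",
    },
    files={"image": open("input.png", "rb")},
    data={
        "prompt": "golden retriever",  # what to paint in
        "search_prompt": "dog",        # what to find and swap out
        "output_format": "png",
    },
)
with open("edited.png", "wb") as f:
    f.write(resp.content)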
 
Eh, I think the quality is a little better considering this is the base model; the proper comparison would be to base SDXL as it was at release, prior to any community checkpoints. But regardless of that, it is definitely superior at following prompts. Not perfect, but noticeably better.
 
In my experience this is not the case. I get much better results by generating an image at common aspect ratios (e.g. 832x1216, 1024x1024, 768x1344, etc.) and then upscaling. Even base SDXL does not generate coherent images when you go significantly above those sizes. Just as an example, here's the Kim Possible example image, then the same parameters with no upscaling and a base image size of 1216x1792, then the same thing with base SDXL. It brings back the classic stretched torsos, duplicated body parts, etc. Also, if you look at how LoRAs are trained, images are generally normalized to 1024x1024. I have done a few for fun, and the tooling will reduce a 2048x2048 image to 1024x1024, a 1664x2432 to 832x1216, etc.
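If you want to reproduce that gen-then-upscale flow outside a UI, here's a rough sketch with diffusers. The prompt is a placeholder, and the low-strength img2img pass is just one way to do the upscale step, not necessarily what the webui does under the hood:

# Rough sketch: generate at a native SDXL bucket size, then upscale via img2img.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "a detailed portrait photo"  # placeholder
base = pipe(prompt, width=832, height=1216).images[0]  # a native bucket size

# Upscale 1.5x in pixel space, then let img2img re-add detail at low strength.
img2img = StableDiffusionXLImg2ImgPipeline(**pipe.components)  # reuse loaded weights
upscaled = base.resize((1248, 1824))
final = img2img(prompt, image=upscaled, strength=0.3).images[0]
final.save("final.png")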
We're both wrong. I misremembered the SDXL lower range, since there's a model I use that strongly recommends 1024 and up; 832 is within the recommended range, but 768 is under it, FYI. However, the training data resolution doesn't matter in the way you think it does, since there's a scaling stage in the pipeline.
In your case I think the torso issue is actually an aspect ratio thing: SDXL is especially optimised for square aspects and gets weirder the further you get from that. Sticking close to square gens is in the recommendations. Off the top of my head I'm not sure why lower-res rectangular gens are more stable, but if it works it works.

I usually gen squares with XL, but I've had some success with strong negatives for multiple girls/deformity/multiple torsos etc. when that issue pops up, btw. There are probably a few negative embeddings for XL on Civit that could help if you haven't tried that.
 
I was talking about the scaling stage. E.g., here's output from the logs for a LoRA I trained the other day with sd-scripts (https://github.com/kohya-ss/sd-scripts):
bucket 0: resolution (768, 1280), count: 6
bucket 1: resolution (832, 1216), count: 75
bucket 2: resolution (896, 1152), count: 12
bucket 3: resolution (960, 1088), count: 12
bucket 4: resolution (1024, 1024), count: 60
bucket 5: resolution (1088, 960), count: 6
bucket 6: resolution (1152, 896), count: 4
bucket 7: resolution (1216, 832), count: 43
bucket 8: resolution (1280, 768), count: 6
It took all my images of various sizes and scaled them relative to 1024x1024, with buckets of (1024 + 64n, 1024 - 64n). Most community models are trained using this set of tools, I believe, so it makes sense that models would perform better at the resolutions they are specifically trained on.
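The bucketing logic itself is nothing fancy, by the way. This isn't sd-scripts' actual code, just a toy version of the idea:

def make_buckets(base=1024, step=64, max_steps=4):
    # Buckets of (base - 64n, base + 64n) in both orientations, plus the square.
    buckets = [(base, base)]
    for n in range(1, max_steps + 1):
        buckets.append((base - step * n, base + step * n))  # tall
        buckets.append((base + step * n, base - step * n))  # wide
    return buckets

def assign_bucket(width, height, buckets):
    # Snap an image to the bucket with the closest aspect ratio.
    aspect = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - aspect))

print(assign_bucket(1664, 2432, make_buckets()))  # -> (832, 1216), as in the log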

Yeah, I'm not sure why lower res rectangular gens are more stable, but it's pretty easy to just add an upscaling step at the end. I often don't bother if I'm just testing stuff out, but it has the added benefit of cleaning up any small imperfections or artifacts.
 
Nah, you're talking about ordinary resolution normalisation during data preprocessing; I was referring to something in the actual SD process (autoencoder decoding).
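To illustrate: the U-Net denoises at 1/8 of the pixel resolution, and the VAE decodes that back up to pixels at the end. A quick sketch with diffusers; the random latent is only there to show the shapes:

import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="vae"
)

# A dummy "denoised" latent for an 832x1216 image: 4 channels at H/8 x W/8.
latents = torch.randn(1, 4, 1216 // 8, 832 // 8)
with torch.no_grad():
    image = vae.decode(latents / vae.config.scaling_factor).sample

print(image.shape)  # torch.Size([1, 3, 1216, 832])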
 
Milhouse is just the kid from The Wonder Years, who may or may not be Marilyn Manson's staid alter ego. What kind of rip is that? Toss that AI into the woodchipper; back to the drawing board.
 

I'd love to get my hands on Sora, but there's no public version, let alone a local version where you can do whatever. Does anyone know of any alternatives?
 
The characters are already moving slightly.
That's the new thing in AI videos: have them move slightly so it looks impressive. Like the other guy said, though, a lot of the time the "inspiration" behind certain actors is obvious. The Simpsons one is the best, but there are plenty of videos where they "1950s cinemascope" an IP.
 
As others have stated, the subtle movements and zoom-ins to make the characters seem "alive" are really cheap and stupid. However, I think this highlights a great use for ML image generation, which is concept prototyping. The imagery it spewed out is a pretty neat way to visualize "The Simpsons but real", and it could, for example, be used by movie studios to quickly figure out how a proper live-action Simpsons movie should look.

Of course, with time it might get so good that they'll just auto-pump out slop to sell, but even if it doesn't, this use case is still viable.
 
Anything as good as bing/copilot but not as cucked?

Also, I'm trying (again) to get into AI art. Any good setup guide/training guide? (I've got a better rig now.)
 
What do you want from the chatbot? There are a ton of Llama 2 finetunes with 13B GGUFs that may suit your needs, although Llama 3 just came out and will probably get fine-tunes in the next few months.
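If you go the GGUF route, llama-cpp-python will run them locally. A minimal sketch; the model file name is a placeholder for whatever finetune you grab:

# Minimal sketch of running a GGUF finetune locally with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-13b-chat.Q4_K_M.gguf",  # placeholder GGUF file
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to the GPU if it fits
)
out = llm("Q: What's the difference between SD 1.5 and SDXL? A:", max_tokens=128)
print(out["choices"][0]["text"])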

You can't go wrong with https://github.com/AUTOMATIC1111/stable-diffusion-webui

As for models, that's up to you. SDXL is the latest thing you can run locally until SD 3 comes out. There are also fine-tuned checkpoints on Civitai, like Pony Diffusion v6, which is good for anime and other things... If you want to do training or fine-tuning, start with SD 1.5 and skip SD 2. There are plenty of guides.
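And if you'd ever rather script generations than click around the webui, diffusers can also load single-file Civitai checkpoints. A quick sketch; the checkpoint file name and prompt are placeholders:

import torch
from diffusers import StableDiffusionXLPipeline

# Load a single-file checkpoint downloaded from Civitai (placeholder file name).
pipe = StableDiffusionXLPipeline.from_single_file(
    "./ponyDiffusionV6XL.safetensors", torch_dtype=torch.float16
).to("cuda")

image = pipe("placeholder prompt", width=1024, height=1024).images[0]
image.save("out.png")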
 
That's a good start.

I just want to generate stuff like this without the bing censor going "ERROR! ERROR! NON-AMORPHOUS GENDERBLOB DETECTED! PROMPT BLOCKED!"

 