Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

It really can feel like a slot machine at times
When people get that dopamine hit from finally landing the right result after trying and trying, it's basically gambling. People are already totally addicted to this. There's been a lot of drama around people stealing API keys for GPT-4 etc. that other, witless people had uploaded along with their source code to places like GitHub to get (stolen) access to these AIs, usually for erotic RP. It got so bad that GitHub, Hugging Face and co. started cooperating with OpenAI to automatically invalidate API keys that get uploaded to those platforms. I'm not up to date on the developments there, but I'm sure it's still an ongoing thing. Since GPT-4 and co. are really fucking expensive, it's reasonable to imagine that some people have probably thrown themselves into debt over this. I mean, people get addicted to the lamest shit sometimes, why not this?
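As far as I know, the invalidation is built on the same kind of secret scanning the platforms already run on uploads. A rough sketch of what that looks like - the "sk-" pattern here is only an approximation, not the actual rule GitHub or OpenAI use:

```python
import re
from pathlib import Path

# Rough illustration of secret scanning: walk a repo and flag anything that
# looks like an OpenAI-style API key. The pattern is an approximation of the
# "sk-..." format, not the real detection rule.
KEY_PATTERN = re.compile(r"\bsk-[A-Za-z0-9]{32,}\b")

def find_leaked_keys(repo_root: str):
    """Yield (path, key) pairs for strings that look like leaked API keys."""
    for path in Path(repo_root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for match in KEY_PATTERN.finditer(text):
            yield path, match.group(0)

if __name__ == "__main__":
    for path, key in find_leaked_keys("."):
        # A real scanner would report the hit to the provider for revocation.
        print(f"possible leaked key in {path}: {key[:6]}...")
```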
---

I played around with SC (the biggest file versions) some more and it's by far the best open model right now. I usually like to do non-photorealistic styles and it has these down pat.
dog.jpg

It also does pixel art pretty well. I had an entire workflow around SDXL and cleaning up generated pixel art. Sadly I didn't document it well and have since forgotten it; a rough sketch of that kind of cleanup is below the image.

pixel.jpg
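The general idea was something along the lines of snapping the output onto a coarse pixel grid and crushing the palette. A minimal sketch of that kind of pass (the grid and palette sizes are guesses, not my lost settings):

```python
from PIL import Image

def clean_pixel_art(path: str, grid: int = 64, colors: int = 16) -> Image.Image:
    """Snap AI-generated 'pixel art' to a real grid and a small palette."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    # Downscale so each intended pixel cell becomes a single real pixel.
    gw, gh = grid, max(1, round(grid * h / w))
    small = img.resize((gw, gh), Image.NEAREST)
    # Quantize to a limited palette to kill the diffusion noise between cells.
    small = small.quantize(colors=colors).convert("RGB")
    # Upscale back with nearest-neighbour so the pixels stay crisp.
    return small.resize((w, h), Image.NEAREST)

if __name__ == "__main__":
    clean_pixel_art("pixel.jpg").save("pixel_clean.png")
```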

Simpler drawings of vamps, both booby and scary.

vamp1.jpgvamp2.jpg

And here is what impressed me the most - I installed Arcanum again the other day. I love playing even-tempered half-orcs and half-ogres in that game, characters who can't get by on either their good looks or their charisma, which is not a winning formula in that game. The way SD usually works, trying to generate somebody subtly half-orc-like, as in that game, would be hopeless, if only because of the wording you'd need to use. These models deal way too much in absolutes for that. Somehow, SC pulled off the concept of the half-orc flawlessly, with varying degrees of subtlety, even with a Victorian theme.

halforc1.jpghalforc2.jpghalforc3.jpghalforc4.jpg

Also note how consistently similar these characters are. That bodes very well for fine-tuning on specific motifs.
 
I played around with SC (the biggest file versions) some more and it's by far the best open model right now. I usually like to do non-photorealistic styles and it has these down pat.
I'm no prompt engineer so I'm comparing how SC works to other models with basic prompts. For example,
"A DVD screenshot of a BBC documentary from 1977 of a beautiful angry woman with glowing eyes, levitating in a street in London. In the background there's a storm"
image (10).pngimage (8).pngimage (5).pngimage (6).pngimage (9).png
It's replicated a woman with weird eye makeup mid-frame in a rainy street with her mouth agape, but it hasn't really generated what I asked for.
Compared with Midjourney - which is definitely more what I had in mind, although none of them are levitating, and I didn't ask for them to have lightning powers. Does also actually look like London.
midj1 - Copy.pngmidj2 - Copy.pngmidj3 - Copy.pngmidj4.png
and Bing - these are gloriously unhinged
Designer (3).pngDesigner (4).pngDesigner (5).pngDesigner (6).png
"A daguerreotype of withered geraniums on a table in a Montmartre bistro" These aren't geraniums, although these do look like daguerreotypes.
image (16).pngimage (17).pngimage (18).png
Midjourney - this was the vibe I was going for, although those aren't geraniums.
middj1.pngmiddj2.pngmiddj3.png
Bing - actually knows what a geranium is.
Designer (11).pngDesigner (12).pngDesigner (13).png
"A kodachrome photograph of a giant man in a bowler hat chasing a herd of elephants through Hyde Park in 1962". It's just generated elephants in front of what looks like the US Capitol building (apparently there's a neighbourhood in DC called Hyde Park).
image (12).pngimage (11).png
Midjourney - there's actually a man chasing the elephants, and I guess he's giant in relation to them and he is wearing a hat (just a top hat).
midi1.pngmidi2.png
Bing - I love these.
Designer (9).pngDesigner (10).png

It feels like SC gets my prompt "wrong" a lot more - mostly because I'm not bothering to really try beyond a simple description - but once it's worked out what it thinks a prompt should be it's surprisingly consistent. For a casual like me Bing seems to understand my prompt the most but struggle with putting it together in a way that makes sense, and Midjourney tends to make the best overall pictures. But the consistency of SC is impressive and I can see that it's another step towards having an image AI be able to generate persistent people/characters/settings.
 
Everybody around me is starting to think I'm an artistic genius that kept my talent hidden all these years. Should I keep the mystique or share my secret?
I'd say keep people guessing; when you want to be found out, surreptitiously drop an image that has one too many fingers.
 
You didn't tell them it was AI the whole time?
I'd say keep people guessing; when you want to be found out, surreptitiously drop an image that has one too many fingers.
They asked me if I made it, and I just said yeah. If they ask whether I did it with a brush, I say no, I couldn't do that, I used a computer.

Some say they would have liked it more if I used a brush. I agree.

It's also worth pointing out that I have a lot of experience retouching images and that I do edit the pictures for parts that I don't like (ahem, hands), but even when people give detailed critiques they never seem to mention the glaring parts that I edited. Or that's how it seems to me.

I don't have pictures with the wrong number of fingers, because I don't like the look.
 
1708692063068.png1708692565324.png1708693696600.png

I tried Stable Cascade just now, and it is indeed very impressive. It is a bigger advancement over Stable Diffusion XL than Stable Diffusion XL was over Stable Diffusion. The only thing it lacks is the speed of Stable Diffusion XL Turbo, but the mad lads at Stability AI will probably come up with a Turbo version of Cascade some time down the line.

To add some more concrete information to the topic of Stable Cascade: it requires far less VRAM than XL. With XL, I could create images only up to 768x768 before running out of graphics memory. With Cascade, I can make them 2048x2048 or possibly even larger. The size of the image is not as huge a factor for VRAM usage anymore, but inference still takes about as long as it would with XL. Still, VRAM is always the biggest concern, so any cuts on that are great. This was all done on my RX 6700 XT, which has 12 GB of VRAM. 1024x1024 images like the ones above use 9.5 GB of VRAM during inference.

However, Cascade does seem to require generally more RAM than XL. This will probably change in the future as people in the ecosystem develop new tricks, but at the moment the models need around 14 GB of memory in your system, at least if you get the fully featured ones. This time around you need three models, since Cascade works in three stages, and they can add up to a lot of data that has to be loaded into memory. It does seem like you can get "lite" versions of the models and possibly push RAM usage even lower than XL, but I imagine those sacrifice accuracy substantially. I have not experimented with those since I have enough RAM on my system.
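For anyone who wants to poke at it themselves, the three stages map onto two pipelines in diffusers (the prior covers stage C, the decoder covers stages B and A). A minimal sketch, assuming diffusers 0.27 or newer and the checkpoints from Stability's Hugging Face repos - exact VRAM numbers will differ depending on dtype and whether you grab the lite variants:

```python
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

# Stage C: text -> compressed image embedding. Stages B+A: embedding -> pixels.
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
).to("cuda")
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.bfloat16
).to("cuda")

prompt = "a Victorian portrait of a half-orc gentleman, oil painting"

# The prior works in a heavily compressed latent space, which is why the
# requested resolution barely moves VRAM usage.
prior_out = prior(
    prompt=prompt, height=1024, width=1024,
    guidance_scale=4.0, num_inference_steps=20,
)

images = decoder(
    image_embeddings=prior_out.image_embeddings,
    prompt=prompt, guidance_scale=0.0,
    num_inference_steps=10, output_type="pil",
).images
images[0].save("cascade.png")
```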

All in all, as @Slav Power said, the model is easily on par with DALL-E 3, if not slightly better. It cannot do as much text as DALL-E 3 can, but the fact it can generate any amount of coherent text now is very impressive. I am waiting excitedly for people to train more fine-tuned Cascade models, because you really can do waaay more with this than you could ever hope for in XL.
 
All in all, as @Slav Power said, the model is easily on par with DALL-E 3, if not slightly better. It cannot do as much text as DALL-E 3 can, but the fact it can generate any amount of coherent text now is very impressive. I am waiting excitedly for people to train more fine-tuned Cascade models, because you really can do waaay more with this than you could ever hope for in XL.
Well, they have also announced:
Stable Diffusion 3 (archive)
Stability announces Stable Diffusion 3, a next-gen AI image generator (archive)

It looks like they intend to develop these two approaches in parallel, at least until Cascade's "Würstchen architecture" is ready to take over.
 
Stable Cascade is not censored so I thought I'd test how it compares making naked men and naked women. Turns out it will very happily make breasts, but otherwise is only prepared to do ken dolls or automatically adds underwear. I don't particularly want little naked AI people but it is interesting - I assume it's a limitation of the training data.
image (23).pngimage (22).png
image (24).pngimage (25).png
image (26).pngimage (27).png
image (35).pngimage (36).png
"A close up shot of a penis and testicles"/"A close up shot of a vagina"/"A close up shot of a woman's vulva - labia majora and labia minora "
image (28).png
image (30).pngimage (32).png
 

Stable Cascade is not censored so I thought I'd test how it compares making naked men and naked women. Turns out it will very happily make breasts, but otherwise is only prepared to do ken dolls or automatically adds underwear. I don't particularly want little naked AI people but it is interesting - I assume it's a limitation of the training data.
The base models of SD have never been super great at producing aesthetically-pleasing genitalia. Maybe try weighting those terms more heavily? Otherwise it sounds like you'll need to incorporate LoRAs or use models that are trained towards making those more faithfully.
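As a sketch of what that would look like in practice - the LoRA filename here is a hypothetical placeholder, and the (term:weight) syntax is an AUTOMATIC1111/ComfyUI convention, so plain diffusers would need a helper like compel to honour weights:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load an SDXL base model and stack a community-trained LoRA on top of it.
# "anatomy_lora.safetensors" is a placeholder filename, not a real checkpoint.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("anatomy_lora.safetensors")

# In AUTOMATIC1111/ComfyUI you would weight terms directly in the prompt,
# e.g. "figure study, (anatomically correct:1.4), photo".
image = pipe("figure study, anatomically correct, photo",
             num_inference_steps=30).images[0]
image.save("out.png")
```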
 
Stable Cascade is not censored so I thought I'd test how it compares making naked men and naked women. Turns out it will very happily make breasts, but otherwise is only prepared to do ken dolls or automatically adds underwear. I don't particularly want little naked AI people but it is interesting - I assume it's a limitation of the training data.
Just give the CivitAI community some time, they'll train so many porn models that you'll be able to generate photorealistic dicks and cunts all day all night long if that's what tickles your fancy.

Personally I'm just hoping that this shit'll get more optimized so I can play with it on my 1060 without waiting 3 minutes for a single generation and having to close most of what I have opened to free up RAM.
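Some of that already exists as generic diffusers memory knobs, for what it's worth - no promises about how well they behave with Cascade specifically, this is just the usual low-VRAM toolkit:

```python
import torch
from diffusers import StableCascadeDecoderPipeline

# Generic diffusers memory-saving switches for small GPUs; they trade speed
# for VRAM by keeping submodules in CPU RAM when they are not in use.
pipe = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.float16
)
# Keep only the currently active model component on the GPU.
pipe.enable_model_cpu_offload()
# Even more aggressive (and slower): offload layer by layer instead.
# pipe.enable_sequential_cpu_offload()
```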
 
I AM SO EXCITED FOR EVERY PIECE OF MEDIA AND CONTENT ON THE INTERNET TO BE MEANINGLESS ALGORITHMIC HYPERSEO AI-SLOP GENERATED BY CONTENT FARMS IN BANGLADESH USING OPENAI'S ™️ PROPRIETARY SAFE AND EFFECTIVE CONTENT MODELS BY BROWN PEOPLE WHO ARE PAID 20 CENTS AN HOUR TO TARD WRANGLE A COMPUTER WHILE C LEVEL EXECUTIVES AT CORPORATIONS GET RICHER AND RICHER BY DOING ABSOLUTELY FUCK ALL.

I JUST CAN'T WAIT TIL EVERY SITE IS JUST FLOODED NON-STOP WITH COMPLETE GARBAGE. EVERY SITE ON THE INTERNET IS GOING TO BASICALLY BECOME QUORA, EXCEPT IT WILL BE EVEN BETTER AT GETTING YOU TO CLICK AMAZON AFFILIATE LINKS!!! WOAAW! ISN'T AI GREAT?

BUT HEY, LOOK YOU CAN GENERATE PORN!!!! YOU CAN GENERATE AS MANY 4k 8k 60 FPS PHOTOS AND VIDEOS OF RAINBOW DASH FUTA COCK INFLATION AS YOU WANT. SURELY THIS WILL HAVE NO IMPLICATIONS IN THE FUTURE AS PEOPLE'S BRAINS BECOME PERMANENTLY FUCKED.

WE'RE MOVING THE CAP UP ON STURGEON'S LAW, WHY HAVE 90% OF EVERYTHING BE COMPLETE SHIT WHEN 99.9999% OF IT COULD BE INSTEAD? AFTER ALL, YOU ARE NOTHING BUT A HACKABLE ANIMAL DESIGNED TO CONSUME PLASTIC TRINKETS ON TEMU AND 30 SECOND SHORT FORM VIDEO CONTENT TO ENRICH THE POLITICAL AND FINANCIAL ELITE SO THEY CAN GET A NICER GULFSTREAM TO FLY ON TO EPSTEIN'S ISLAND WITH.

OH AND THE BEST PART, TECHNOLOGY WILL STILL FUCKING SUCK!!!! JUST YESTERDAY I HAD TO GO TO THE UPS STORE TO FAX A FUCKING DOCUMENT BECAUSE EMAIL IS STILL A FOREIGN CONCEPT TO SOME PEOPLE. BUT HERE WE ARE IN 2015 + 9 WITH NEURAL NETWORK VIDEO GENERATION BUT PEOPLE STILL CAN'T FIGURE OUT HOW EMAIL WORKS.

"HOW MUCH SAWDUST CAN FIT IN A RICE KRISPIE TREAT? ALL OF IT. FUCK YOU"



.00024.png

WE AT MARKETING HAVE TAKEN YOUR SUGGESTION INTO ACCOUNT. EXTRA SAWDUST WILL BE ADDED TO YOUR RICE KRISPY RATION.
 