Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

Is there a specific reason you are not running it locally? From my testing you can basically run it on GPU cards that are two generations old, with obvious speed setbacks, but it's manageable.
My last GPU got fried during a power surge, so I've been using an RX 570 while waiting for GPU availability to get better. I'm still waitlisted for a V100, so until that purchase goes through I'm using cloud computing.
If you are not planning on merging models to make your own, or on training on top of them, then getting fp16 models basically has no drawback, and it cuts the file size roughly in half.
Yeah, that's a problem for me. I've used FP16 models in the past, but I often have cases where I want to pull characters from model X and put them into style Y (Nyxborn Pokemon, me as a Phyrexian, etc.).
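
For anyone who doesn't merge or train and just wants the smaller files, the fp32-to-fp16 conversion itself is only a few lines of PyTorch. Rough sketch only; the filenames and the nested "state_dict" key are assumptions about how your particular checkpoint is laid out:
[CODE=python]
# Rough sketch: shrinking a full-precision SD checkpoint by casting its weights to fp16.
# Filenames and the "state_dict" key are assumptions; adjust for your own files.
import torch

ckpt = torch.load("model-fp32.ckpt", map_location="cpu")
state = ckpt.get("state_dict", ckpt)  # many .ckpt files nest the weights under "state_dict"

for key, tensor in state.items():
    if isinstance(tensor, torch.Tensor) and tensor.dtype == torch.float32:
        state[key] = tensor.half()  # fp32 -> fp16 roughly halves the on-disk size

torch.save(ckpt, "model-fp16.ckpt")
[/CODE]
If you do plan to merge models or train on top of them later, keep the full-precision copy around; that's where the fp16 cut actually costs you.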
I've started downloading all the models and LoRAs I want to try because a lot are getting removed: TOS changes, legal issues, artists getting mad and demanding they be taken down, FUCKING EXCLUSIVITY DEALS where they sell the model to AI-generation websites and pull it everywhere else, like what happened recently with the Illuminati model (the cocksuckers 🎩), and so on. So I download everything I think looks cool, save a PDF of the download page so I have info about triggers and best settings, and then store it away.
Patches-Final-scaled.jpg
Once enough phones on the market are able to run SD without connecting to servers, it's all over for free high-end models. I can already see it in my mind's third eye: you're gonna have apps like Wombo buying off well-trained models and selling them as in-app purchases.

Voice.ai's releasing their STS service on mobile this month, so I'm willing to bet Midjourney and the other vultures are trying to get this tech running locally on mobile devices.
 
So I've been going hog-wild generating NPC portraits in Novel for an espionage GURPS game I'm running. The portraits are anime-ish in style, just because I find the quality a bit more consistent and I have limited skill.

A lot of the portraits are serviceable and aren't worth talking about, but then I got to generating one for a mole character (not a literal mole), and it's a real existential abomination, because now I've effectively sold my soul to AI over it.

beret girl_102312.png
She is cute and precious and I must protect her.

An attempt was also made to give her a more professional alter-ego, and I like it, but nothing will ever match the first one.
beret girl no beret 2_103543.png
 
View attachment 5042017 View attachment 5042020

Testing SD 2.1, since I’ve been almost exclusively working with 1.5 and NAI based models. Even the regularization images for the dreambooth training are looking fresh.
Waifu Diffusion 1.5 beta 2 aesthetic version is the best looking 2.1 model to me. Those Jeet chink fags struck gold with 1.2 and 1.3 and it seems they're finding it again after that massive flop that was Waifu Diffusion 1.4(that brief period where everyone was a "muh ethics" fag). It might get popular because I found out you can make coomer shit with it.
 
There's a tweet that is currently trending that shows AI being able to animate real people dancing. Do note: The replies are full of coping and seething from artists. :story: You know you've won when all they can do is cherry pick certain frames.
 
There's a tweet that is currently trending that shows AI being able to animate real people dancing. Do note: The comments are full of coping and seething from artists. :story: You know you've won when all they can do is cherry pick certain frames.
Even without getting into specific frames it's immediately obvious that it has a problem with consistency between frames. It's cool that it's essentially created an easier form of rotoscoping, but the technology is still in a nascent stage and will either need better logic for animation or need to be part of a larger pipeline that refines the raw output into a viable end product before it can be used outside of more surreal and/or psychedelic media.
Speaking of which, I stumbled on a music video that leverages AI art in a creative way to create some really cool dreamlike visuals:
 
Even without getting into specific frames it's immediately obvious that it has a problem with consistency between frames. It's cool that it's essentially created an easier form of rotoscoping, but the technology is still in a nascent stage and will either need better logic for animation or need to be part of a larger pipeline that refines the raw output into a viable end product before it can be used outside of more surreal and/or psychedelic media.
Yeah, I can see the inconsistency, like in the clothes and background, but compared to back then it's massively improved. I believe that in the near future AI art is definitely going to be on par with the stuff you see on the internet. It's pretty insane: a year or two ago, when I first saw AI try to animate a real person, it looked awful, and seeing it progress to at least looking aesthetically good, even if still distinguishable, is just wow.
 
There's a tweet that is currently trending that shows AI being able to animate real people dancing. Do note: The replies are full of coping and seething from artists. :story: You know you've won when all they can do is cherry pick certain frames.
View attachment 5044285
Thank god traditional animation never has weird artifacts
9E3B69AB-23F2-4F6B-B025-34C7577050B7.jpeg
CCF301F9-5D1F-4D08-A829-4F72DE12EC6C.jpeg8EA10535-3BF8-4EFD-A628-35E33B394040.jpeg0E2507F1-F3DA-495F-8ACF-8A762F25032A.jpeg

Just render the girl naked and animate the clothes on a second layer, redrawing them when they aren’t good enough. It’s literally what I’d do when animating a character wearing stuff that needs simulated physics.
 
Even without getting into specific frames it's immediately obvious that it has a problem with consistency between frames. It's cool that it's essentially created an easier form of rotoscoping, but the technology is still in a nascent stage and will either need better logic for animation or need to be part of a larger pipeline that refines the raw output into a viable end product before it can be used outside of more surreal and/or psychedelic media.
Speaking of which, I stumbled on a music video that leverages AI art in a creative way to create some really cool dreamlike visuals:
Still has a ways to go, but it's definitely gotten much better over the span of just a few months.
 
Waifu Diffusion 1.5 beta 2 aesthetic version is the best looking 2.1 model to me. Those Jeet chink fags struck gold with 1.2 and 1.3 and it seems they're finding it again after that massive flop that was Waifu Diffusion 1.4(that brief period where everyone was a "muh ethics" fag). It might get popular because I found out you can make coomer shit with it.
Waifu Diffusion is a pretty popular one, but I think someone made a coomer version called Hentai Diffusion that works better for that purpose. And speaking of the whole “ethics” thing —

View attachment 5042017 View attachment 5042020

Testing SD 2.1, since I’ve been almost exclusively working with 1.5 and NAI based models. Even the regularization images for the dreambooth training are looking fresh.
Stable Diffusion 2 varies a lot in content and quality. They removed a lot from the training data and literally made it harder to prompt for artists and even famous people. So 2.0, without any add-on or merge, has little to no idea about art, which is really funny to me.
 
Has anyone had any luck with AI accelerator cards? I'm thinking of buying some google coral stuff.
Yeah, I know that buying from Google is a crapshoot considering they're part of the movement trying to put the genie back in the bottle, but they're pretty much the only game in town if you don't want to shell out for a big fuck-off GPU. They've got USB devices and all sorts of cards for various motherboard slots, and there are drivers for whatever OS you want.
 
Wanted to see what difference class images actually make in Dreambooth training, so I generated the same image on three models: the vanilla base model as a point of reference, one trained with my normal method but without class images, and one trained using my normal source-to-class image ratio.
regularization test.png


For reference, here's a sample from the dataset.
serra.png benalish hero.jpg

EDIT: I thought this class image was kinda neat, so I'm posting it here:
02cdcf20c78f8aa5090034f3ee8bf5eca86af327.png
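
For anyone wondering what the class images are actually doing during training: they feed a second "prior preservation" term in the loss, so the model learns the new subject without forgetting what the broader class looks like. A minimal sketch of that idea (the names are illustrative, not the Automatic extension's actual internals):
[CODE=python]
# Minimal sketch of Dreambooth prior preservation.
# Batches are commonly packed as [instance images | class images]; names here are illustrative.
import torch
import torch.nn.functional as F

def dreambooth_loss(noise_pred, noise_target, prior_loss_weight=1.0):
    # Split the batch back into the instance half and the class half.
    pred_inst, pred_class = noise_pred.chunk(2, dim=0)
    tgt_inst, tgt_class = noise_target.chunk(2, dim=0)

    instance_loss = F.mse_loss(pred_inst, tgt_inst)  # learn the new subject
    prior_loss = F.mse_loss(pred_class, tgt_class)   # keep the base class from drifting
    return instance_loss + prior_loss_weight * prior_loss
[/CODE]
Training without class images is roughly equivalent to dropping that second term, so the comparison above is basically measuring how much that term matters.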
 
Wanted to see what difference class images actually make in Dreambooth training, so I generated the same image on three models: the vanilla base model as a point of reference, one trained with my normal method but without class images, and one trained using my normal source-to-class image ratio.
View attachment 5060872

For reference, here's a sample from the dataset.
View attachment 5060882 View attachment 5060891

EDIT: I thought this class image was kinda neat, so I'm posting it here:
View attachment 5061202
Do you have any tutorial or something about training models or LoRAs?
I have tried to learn stuff, but it feels like I'm just swimming in the dark.


00027-3880247440.png 00024-4137302156.png

00016-2357953129.png 00010-2432726362.png
 
Do you have any tutorial or something about training models or LoRAs?
I have tried to learn stuff, but it feels like I'm just swimming in the dark.


View attachment 5063126 View attachment 5063127

View attachment 5063128 View attachment 5063129
I tried writing some pointers for Dreambooth training a while back, but the Automatic extension changed drastically between then and now. It also doesn't help that the process is different between SD 1.5 and SD 2.0/2.1.

That said, I'll try and draft up a more clear-cut tutorial. Lemme know if you want me to focus more on styles or people.
 
Is there a name for this type of animation yet? If not, I dub it "noiseimation". It's where you use img2img on an image with denoising turned down low, generate multiple outputs with different seeds, then put them all together in an animation. This one I call "In the Multiverse of Chadness", made from 12 images with denoising strength at 0.15.

(click thumbnail for full size. Tiny thumbnail doesn't show the effect very well.)
itmoc.gif
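
In case anyone wants to reproduce it outside the webui, here's a rough diffusers sketch of the same idea: one source image, low denoising strength, a different seed per frame, frames stitched into a looping GIF. The model name, prompt, and paths are placeholders, not necessarily what was used above:
[CODE=python]
# "Noiseimation" sketch: low-strength img2img over the same source image,
# one frame per seed, assembled into a GIF. Model, prompt, and paths are placeholders.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

source = Image.open("source.png").convert("RGB")
prompt = "portrait of a man, detailed, sharp focus"

frames = []
for seed in range(12):  # 12 frames, one per seed
    gen = torch.Generator("cuda").manual_seed(seed)
    frames.append(pipe(prompt=prompt, image=source, strength=0.15, generator=gen).images[0])

# Loop the frames at 10 fps; each seed's slightly different noise gives the boiling-line jitter.
frames[0].save("noiseimation.gif", save_all=True, append_images=frames[1:], duration=100, loop=0)
[/CODE]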
 
Is there a name for this type of animation yet? If not I dub it "noiseimation". It's where you use img2img on an image with denoising turned down low and generate multiple outputs with different seeds then put them all together in an animation. This one I call "In the Multiverse of Chadness", made from 12 images with denoising strength at 0.15.

(click thumbnail for full size. Tiny thumbnail doesn't show the effect very well.)
View attachment 5066351
That level of noise is reminiscent of the Take On Me video, although I'm sure there's a better example somewhere.

 
Is there a name for this type of animation yet? If not I dub it "noiseimation". It's where you use img2img on an image with denoising turned down low and generate multiple outputs with different seeds then put them all together in an animation. This one I call "In the Multiverse of Chadness", made from 12 images with denoising strength at 0.15.

(click thumbnail for full size. Tiny thumbnail doesn't show the effect very well.)
View attachment 5066351
Reminds me a bit of squigglevision, like you'd see with Dr. Katz and Home Movies, though that was more on character outlines than the detail. Cool effect.
 
I tried writing some pointers for Dreambooth training a while back, but the Automatic extension changed drastically between then and now. It also doesn't help that the process is different between SD 1.5 and SD 2.0/2.1.

That said, I'll try and draft up a more clear-cut tutorial. Lemme know if you want me to focus more on styles or people.
I'm mostly interested in SD 1.5 and styles more than people, but I'd appreciate anything.

Model: 4moonNI
VAE: vae-ft-mse-840000-ema-pruned

Positive Prompt: 1girl, chromatic abberation, aesthetic, attractive, athletic, kigurumi or camisole or dress shirt, vest, navel cutout,
Negative Prompt: badv5, EasyNegative, (low quality, worst quality:1.4), (bad anatomy), (extra long neck), (open clothes), (inaccurate limb:1.2), bad composition, inaccurate eyes, extra digit, fewer digits, (extra arms:1.2), lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name

Step 1: Text2Img
0-00018-3087026559.0.png

Step 2: Img2Img "Tiled Diffusion + Tiled VAE" 2x Upscale
1-00039-3087026559.0.png

Step 3: Img2Img "Tiled Diffusion + Tiled VAE" 4x Upscale
(Diminishing returns, but changes are visible if you open the pic at full resolution)
2-00042-3087026559.png

Step 4: TEST Img2Img "Tiled Diffusion + Tiled VAE" 6x Upscale + 80 more steps
1681830070675.png
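
For anyone not using the Automatic1111 extension, the general shape of those passes can be roughed out in plain diffusers: upscale the image, run a low-strength img2img pass to re-add detail, and repeat. Sketch only; it uses stock VAE tiling instead of the actual Tiled Diffusion / Tiled VAE extension, and the model, strength, and trimmed prompts are stand-ins rather than the exact settings above:
[CODE=python]
# Sketch of a repeated "upscale, then low-strength img2img refine" pass.
# Plain diffusers with VAE tiling, not the Tiled Diffusion extension; settings are stand-ins.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.enable_vae_tiling()  # decode the VAE in tiles so large images don't blow up VRAM

prompt = "1girl, aesthetic, athletic, vest, navel cutout"
negative = "low quality, worst quality, bad anatomy, watermark"

image = Image.open("txt2img_base.png").convert("RGB")
for _ in range(2):  # two 2x passes give the 4x stage; further passes hit diminishing returns
    image = image.resize((image.width * 2, image.height * 2), Image.LANCZOS)
    image = pipe(prompt=prompt, negative_prompt=negative, image=image,
                 strength=0.35).images[0]  # low strength: add detail, keep the composition

image.save("upscaled_4x.png")
[/CODE]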
 