Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

I took around 1,000 images from the SJW Art and Extremes thread, and am using them to train a massive negative LoRA (which I will share, of course, when it's done).

Here are some example results:

[attached image: new1.png]
 
Saw this and was very impressed by how realistic most of it looks. I'd mute the music, though!



In other news, Stability AI seems to have managed to secure some further funding. Still don't really know what their business model is supposed to be or if this funding will be enough to help them stagger on, but time will tell. Hope we get to see a public release of the 8B model some day. They still say they're going to:
 
The ridiculous clauses attached to the lobotomized, inferior SD3 release must be internal sabotage. I refuse to believe anyone in that company read their own inane bullshit monetization model and said 'this will definitely go well with customers'.
This tech is moving way too fast for me to keep up with. Do other competitors allow the same freedom to create your own models from scratch like SD does? Every decent model, whether it's image generation, music creation, or a chatbot like ChatGPT, seems to only let you interact with a carefully curated, pre-made, company-trained model, lobotomized and pozzed to all hell. Not SD; from my understanding they're one of the only ones who just tossed the whole tech onto the public to tinker with.
This must have pissed every other company off. You don't just toss out such powerful tools to the proles, doubly so when they've been provided for free. For this, Stability AI has to be destroyed; from the outside or the inside, it doesn't matter. That's why SD3 is a disaster. The tech will advance further, but the only ones who should be benefiting from it are those that keep the power mostly to themselves and allow customers to only use a curated model, not a company that allowed the average joe to go hog wild, giving away the tools to create relatively convincing 'deep fakes'. Next thing you know, they'll release a ChatGPT equivalent that will actually allow you to train a model that'll be rational and logical, one that'll call mental illness a mental illness, or will actually agree that it's okay to say a racial slur to prevent millions of deaths in certain hypothetical scenarios.
You can't have that. The genie is out of the bottle, so to speak, when it comes to SD; you can't undo what they've already released, but you sure as hell won't get the latest and the greatest. Stability, as the company that brought you SD1.5/XL, will soon cease to exist, if it hasn't already.
 
This tech is moving way too fast for me to keep up with. Do other competitors allow the same freedom to create your own models from scratch like SD does? Every decent model, whether it's image generation, music creation, or a chatbot like ChatGPT, seems to only let you interact with a carefully curated, pre-made, company-trained model, lobotomized and pozzed to all hell. Not SD; from my understanding they're one of the only ones who just tossed the whole tech onto the public to tinker with.
What's hilarious is how SD ate shit over the Taylor Swift deepfakes thing, when most of the images were generated using Microsoft's tools by abusing their shitty filtering. There were dozens of threads on /g/ and other parts of 4chan about how to evade their NSFW limitations. Most of the holes seem to be patched now, but people spent months generating images before that, and those images are still getting reposted.

The recent news on the open weights front is the Open Model Initiative (reddit link). It sounds somewhat promising, but they are terrified of the CSAM issue. Their solution is to remove all children from their training data.
  • We plan to curate datasets that avoid any depictions/representations of children, as a general rule, in order to avoid the potential for AIG CSAM/CSEM.
Besides that, there are more niche models like Pixart and Lumina. It seemed like people were holding off on starting anything new, waiting to see how SD3 would turn out before investing a bunch of time and money in an alternative. Now that Stability seems to be collapsing, there is space for new entrants to try to become the next big thing in open source.
 
Seems like almost every online AI tool or generation site has "sign in with Google" (never works), is nonfunctional, has limited free use, or some combination of such BS.
It’s still a new industry and they’re trying to figure out how to monetise. You can’t charge a subscription if nobody knows who you are, so you offer a free service and mine data through Google to make a bit of cash.
 
Hello everyone, I got Stable Diffusion 1.5, and I'm using the webui from Automatic1111. I'm trying to get into training a model using Dreambooth, but I'm having difficulty finding the Dreambooth extension on its own. There's a lot to read through in this thread, so what would you do to cut down the time training your model from your own library of images?

Also, hearing about Stable Diffusion 3 made me get the older version, so thank you guys.
 
Hello everyone, I got Stable Diffusion 1.5, and I'm using the webui from Automatic1111. I'm trying to get into training a model using Dreambooth, but I'm having difficulty finding the Dreambooth extension on its own. There's a lot to read through in this thread, so what would you do to cut down the time training your model from your own library of images?

Also, hearing about Stable Diffusion 3 made me get the older version, so thank you guys.
Instructions on how to download and install Dreambooth can be found here. I have trained many LoRAs using the kohya-ss utility, but I've never touched Dreambooth, so I cannot be of much assistance there. Fortunately, there do exist tutorials all over the place online.

Again, I cannot speak to actual model training, but I can give some insight into LoRA training times. On my 4090 with ~100 reference images and ten epochs at roughly 200 steps, it takes anywhere from 13 to 17 hours to complete.
 
Again, I cannot speak to actual model training, but I can give some insight into LoRA training times. On my 4090 with ~100 reference images and ten epochs at roughly 200 steps, it takes anywhere from 13 to 17 hours to complete.
Are you training SDXL? What res? That's way too long, especially for a 4090. I can train an SDXL LoRA at 1024x1024 with 50 images on one repeat for 50 epochs in about two and a half hours on a 4070. Are you accidentally using the Dreambooth tab instead of the LoRA tab?
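
For comparison, total optimizer steps work out to images × repeats × epochs ÷ batch size, which is usually where wildly different training times come from. A minimal sketch of that arithmetic (the numbers are just the two runs quoted above; batch size 1 and the repeat counts are assumptions):

# Rough LoRA step math: steps = images * repeats * epochs / batch_size
def total_steps(images, repeats, epochs, batch_size=1):
    return images * repeats * epochs // batch_size

# The 4070 run above: 50 images, 1 repeat, 50 epochs -> 2500 steps in ~2.5 hours
print(total_steps(50, 1, 50))
# The 4090 run above: ~100 images, 10 epochs at ~200 steps per epoch -> ~2000 steps
# (assuming 2 repeats at batch size 1), so 13-17 hours points at something else:
# resolution, the wrong tab, or xformers being off
print(total_steps(100, 2, 10))

At similar step counts, a 4090 taking five times longer than a 4070 almost certainly means the config, not the card.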
 
Instructions on how to download and install Dreambooth can be found here. I have trained many LoRAs using the kohya-ss utility, but I've never touched Dreambooth, so I cannot be of much assistance there. Fortunately, there do exist tutorials all over the place online.

Again, I cannot speak to actual model training, but I can give some insight into LoRA training times. On my 4090 with ~100 reference images and ten epochs at roughly 200 steps, it takes anywhere from 13 to 17 hours to complete.
I'll try training Stable Diffusion 1.5 with LoRA, since every time I add Dreambooth into the extensions folder, it stops the webui from working.

Are you training SDXL? What res? That's way too long, especially for a 4090. I can train an SDXL LoRA at 1024x1024 with 50 images on one repeat for 50 epochs in about two and a half hours on a 4070. Are you accidentally using the Dreambooth tab instead of the LoRA tab?
I have reference images; now how do you two feed the model said images with LoRA?
 
Instructions on how to download and install Dreambooth can be found here. I have trained many LoRAs using the kohya-ss utility, but I've never touched Dreambooth, so I cannot be of much assistance there. Fortunately, there do exist tutorials all over the place online.

Again, I cannot speak to actual model training, but I can give some insight into LoRA training times. On my 4090 with ~100 reference images and ten epochs at roughly 200 steps, it takes anywhere from 13 to 17 hours to complete.
Did you turn xformers on, and what are your network dim and alpha? I usually train in the range of 3,000-4,000 steps and it takes me roughly 30 minutes to bake a LoRA.
 
I'll try training Stable Diffusion 1.5 with LoRA, since every time I add Dreambooth into the extensions folder, it stops the webui from working.


I have reference images; now how do you two feed the model said images with LoRA?
Your first step will be tagging the images for the LoRA. Automatic1111 should have a tagging module that will generate tag text files for each image. If all of your images share a common tag, you may wish to remove that shared tag and use a singular trigger tag instead.

For example, if all of the characters in your set are tagged "tall" you may wish to remove the "tall" tag. Place your trigger word at the front of each tag file. It can be anything you want, as long as it's consistent.

Put those files into a directory structure of
"/imgs/1_<your LoRA name here>/" (the number before the underscore is the repeat count kohya reads from the folder name)

Then select the imgs directory in kohya as the images folder.
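
If you'd rather script the shared-tag cleanup than edit every file by hand, a rough sketch (the folder, trigger word, and "tall" tag are just the examples from above, not fixed names):

import os

IMG_DIR = "imgs/1_myLora"   # assumed kohya-style <repeats>_<name> folder
TRIGGER = "myTrigger"       # hypothetical trigger word
DROP_TAGS = {"tall"}        # tags shared by every image, absorbed into the trigger

for fname in os.listdir(IMG_DIR):
    if not fname.endswith(".txt"):
        continue
    path = os.path.join(IMG_DIR, fname)
    with open(path, encoding="utf-8") as f:
        tags = [t.strip() for t in f.read().split(",") if t.strip()]
    # drop the shared tags and put the trigger word first
    tags = [TRIGGER] + [t for t in tags if t not in DROP_TAGS and t != TRIGGER]
    with open(path, "w", encoding="utf-8") as f:
        f.write(", ".join(tags))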
 
I have reference images; now how do you two feed the model said images with LoRA?

Basically, if you're training character LoRAs, you tag the subject and everything else that you don't want the model to learn as part of the subject. That said, I prefer manually going through my images afterwards to tweak tags, as sometimes the automatic tagger misses some stuff or adds tags that I want to be part of my subject. I suggest using BooruDatasetTagManager if you're using booru-style tagging; makes life so much easier.
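
One trick that helps with that manual pass: dump a frequency count of every tag in the set first, so stray auto-tagger output stands out. A minimal sketch, assuming comma-separated booru-style .txt files sitting next to the images in a hypothetical dataset folder:

from collections import Counter
from pathlib import Path

counts = Counter()
for txt in Path("imgs/1_myLora").glob("*.txt"):   # hypothetical dataset folder
    tags = [t.strip() for t in txt.read_text(encoding="utf-8").split(",")]
    counts.update(t for t in tags if t)

# tags in nearly every file are shared-tag/trigger candidates;
# tags appearing once or twice are usually tagger noise worth reviewing
for tag, n in counts.most_common():
    print(f"{n:4d}  {tag}")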
 
Hello everyone, I got stable diffusion 1.5, and i'm using the webui from Automatic1111. I'm trying to get into training a model and using dreambooth to do so but I'm having difficulty trying to find the dreamsbooth extension alone. There's a lot to read through this thread, so what would you do to cut the time in training your model from your own library of images?

Also hearing about stable diffusion 3 made me get the older version so thank you guys.
My workflow is using LoRA_Easy_Training_Scripts for the training and TagGUI for tagging and organizing images. I don't think anyone really uses Dreambooth anymore. If you're confused about config settings, download a LoRA you like from CivitAI, look at the metadata, and copy what they are doing.
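
On reading metadata: kohya-trained LoRAs embed their training settings in the .safetensors header, so you can inspect them locally instead of relying on what CivitAI displays. A sketch using the safetensors library (the filename is a placeholder):

from safetensors import safe_open

# open a downloaded LoRA and dump the embedded training metadata
with safe_open("some_lora.safetensors", framework="pt", device="cpu") as f:
    meta = f.metadata() or {}

# kohya stores its settings under ss_* keys, e.g. ss_network_dim, ss_learning_rate
for key in sorted(meta):
    if key.startswith("ss_"):
        print(key, "=", meta[key])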
 
My workflow is using LoRA_Easy_Training_Scripts for the training and TagGUI for tagging and organizing images. I don't think anyone really uses Dreambooth anymore. If you're confused about config settings, download a LoRA you like from CivitAI, look at the metadata, and copy what they are doing.
Are there YouTube videos, resources, etc. you have used to learn what you've given me, that people could follow for using kohya? There's a lot of information to go through if, say, someone wants to train their models using LoRA from the beginning just to get started.

Especially when this technology is moving fast and the resources posted in the OP may be outdated, like automatic1111 not working properly.
 
Are there YouTube videos, resources, etc. you have used to learn what you've given me, that people could follow for using kohya? There's a lot of information to go through if, say, someone wants to train their models using LoRA from the beginning just to get started.

Especially when this technology is moving fast and the resources posted in the OP may be outdated, like automatic1111 not working properly.
I hope you've been pulling from AUTOMATIC1111's GitHub repository semi-frequently to keep it updated.

There are YouTube users that give tutorials on Stable Diffusion, like Sebastian Kamph, though admittedly many of them have moved on to talk about SDXL or SD3, and any AUTOMATIC1111 UIs they may use are most likely outdated by now.
 
Lenovo touting AI image generation for non-binary girls who "first started using the Internet seven years ago [when I was 12]" in order to be their Authentic Selves. Including creepy uncanny-valley avatar with Annoying Orange style lips.




Why Lenovo, a business-focused vendor, thought this was a good ad for them, I don't know. 3.1M views but only 84 likes. This is a textbook example of why YouTube took away downvotes.

Well, scratch Lenovo off the list of any future purchases! I hope the kid isn't too scarred by this in the future.
 
Lenovo touting AI image generation for non-binary girls who "first started using the Internet seven years ago [when I was 12]" in order to be their Authentic Selves. Including creepy uncanny-valley avatar with Annoying Orange style lips.



Why Lenovo, a business-focused vendor, thought this was a good ad for them, I don't know. 3.1M views but only 84 likes. This is a textbook example of why YouTube took away downvotes.

Well, scratch Lenovo off the list of any future purchases! I hope the kid isn't too scarred by this in the future.
Looks like they did another with a fat Japanese woman:
Also, apparently you can use an AI Queen Latifah made by them to create ads for your business. I had no idea they were doing this, and it is truly horrifying.
 
I'm having problems running the automatic1111 webui. I used to be able to run this thing, but now I'm getting this error:

File "/home/name/Documents/automatic/stable-diffusion-webui/modules/launch_utils.py", line 386, in prepare_environment
raise RuntimeError(

RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

I've installed the latest torch version, my GPU has the latest CUDA version, etc.
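
Before resorting to --skip-torch-cuda-test (which just makes the webui run on CPU), it's worth checking what the webui's own venv actually sees; a torch installed system-wide doesn't help if the venv holds a CPU-only build. A quick diagnostic sketch, run with the venv's python:

import torch

print("torch:", torch.__version__)            # a "+cpu" suffix means a CPU-only build
print("built with CUDA:", torch.version.cuda) # None also means a CPU-only build
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))

If this shows a CPU-only build, reinstalling torch with the matching CUDA wheels inside the venv (or deleting the venv and letting the webui recreate it) is the usual fix.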
I hope you've been pulling from AUTOMATIC1111's GitHub repository semi-frequently to keep it updated.

There are YouTube users that give tutorials on Stable Diffusion, like Sebastian Kamph, though admittedly many of them have moved on to talk about SDXL or SD3, and any AUTOMATIC1111 UIs they may use are most likely outdated by now.
Do you use ComfyUI or AUTOMATIC1111?

And what's this about a config file? What do you need when training Stable Diffusion, or any AI like this, for, say, SillyTavern? I've got images that should have captions in .txt files, and the Stable Diffusion 1.5 pruned safetensors file. What else do you need?
 