Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

Civitai is starting to do a massive crackdown on content and the owner confirms it's payment processors pushing it.
edit:
There is some decent discussion of the issue on reddit as well: https://old.reddit.com/r/StableDiffusion/comments/1k6zf81/the_real_reason_civit_is_cracking_down/

This is 9-11 for menstrual blood diaper piss fetishists.
That affects like 0.1% of the users; the other changes are way more impactful.

Content Filtering & Visibility Changes

We’re making system-level changes to make certain types of content and tags less visible across the platform when browsing with X or XXX enabled:

  • Real people and celebrities: Content tagged with real person names (like "Tom Cruise") or flagged as POI (real-person) resources will be hidden from feeds.
  • Minor content: Content with child/minor themes will be filtered out of feeds.
  • Missing Metadata: Existing X & XXX rated content that lacks generation metadata will be hidden from public view and marked with an alert, giving the content owner a chance to add the required details - at minimum, a valid prompt. Note that content without metadata will not be removed/deleted from the system, but will remain hidden - visible only to the uploader - until updated.
This change won’t take effect immediately - you’ll receive a notification if any of your uploads are affected.
The real people change is annoying, it's going to discourage people from making humor/parody content involving real people. NSFW is already banned entirely.

The killer though is the metadata change. Some people have uploaded hundreds or thousands of images using workflows that do not easily support adding metadata. All that content is going to get nuked unless they manually go back through all their uploads and try to find metadata to add. That's going to be a complete shitshow.

Generator Changes

  • Images created with Bring Your Own Image (BYOI) will have a minimum 0.5 (50%) denoise applied. Images generated on-site, or remixed from other site images, will have no denoise restriction - for example, hi-res fix will still allow the 0.0-1.0 denoise values.
  • With X & XXX browsing levels visible, celebrity names will be blocked. Using celebrity names in combination with mature contexts continues to be prohibited.
This effectively ruins Bring Your Own Image, presumably to prevent deepfake NSFW content of real people. It screws over any fun uses of this though, e.g. putting a picture of a person you know into the DBZ super saiyan image->video LoRA.

Updated Monetization Policies

To tighten our policy around real people and ensure we're staying far away from monetizing likeness without consent:

  • Ads will not be shown on images or resources intended to reproduce the likeness of real people.
  • Tipping (Buzz) will not be possible on images or resources intended to reproduce the likeness of real people.
  • Early Access will not be available for any model intended to reproduce the likeness of real people.
This isn't a huge deal, probably legal ass-covering.
 
The real person change is bizarre. They're all still there, but if you leave "Show me everything including NSFW" enabled, then they're hidden. Go into your profile, turn off NSFW, and they're all back.
 
I'm not sure any of these changes are really that big of a deal. The stuff they're banning is the most disgusting, degenerate shit on the website, and I'm glad it's going to be gone.

The metadata stuff, while annoying, isn't gonna be a thing anyone cares about in a week, since AI images are completely disposable and CivitAI is basically a porn site, with 90% of the photos posted being just some degen's goon material for that day.

The real person changes are the only ones that are slightly inconvenient, just because now you have to go into your settings and explicitly turn off the X and XXX browsing levels to see them. I don't see the point of that, since you can't post generations of real people over the PG rating anyway; you couldn't really make political statements with them, or use them with the on-site generator, period. (I guess it's to not have the celeb models associated with gooning models in any way, but it could be implemented better.)

Unless I'm missing something, I don't see this as a big problem that will kill the website, and if these changes were implemented on their own and not because of payment processors, I wouldn't have any real issue with a majority of them.
 
Which is funny as like 1/3 of the 'celeb' models are adult stars, who are apparently considered 'SFW'.
I legit don't see the point in those models. I'm under no delusion that anyone is downloading models of famous attractive women for anything other than to make nudes of them, so why bother making a model of a fucking porn star? You can already see them doing whatever horrifying sex act you want for free online, why not just goon to that?
 
What are the system resources needed just to generate still images at standard definition?
 
I highly recommend using this addon on top of Framepack; it adds some much needed boosts to generation speed and some QoL additions to the UI. Despite what it says on the tin, the software is weird when it comes to utilizing RAM and GPU VRAM, and this helps alleviate the issue. Went from 15-30 seconds per iteration down to 2, as it should have been.
Bookmarked.
 
What are the system resources needed just to generate still images at standard definition?
Depends on what kind of software and models you'd like to run. Using ComfyUI with Illustrious, Pony, Stable Diffusion, etc., I found myself using around 13GB of RAM and 8GB of VRAM; I'm uncertain about the CPU usage, as that's never been a point of discussion anywhere I looked. These models are fairly versatile when it comes to styles, realism, etc., and generation speed is roughly 1.00-1.72 it/s. 50 seconds for 30 steps is a legit good pace, especially since I run an image through multiple passes.

However, I dabbled with things like Flux and HiDream, which are supposedly more capable at things like generating text, but those chew up a lot more RAM and VRAM, surpassing my computer's capabilities. It's possible to run downscaled versions in GGUF format, much like an LLM, but even then the generation speed is slow and you lose quality.

All in all, the more you have, the better, but at least start with 16GB of RAM and 8GB of VRAM. Most of all, get yourself an SSD so those gigabyte-large models load in seconds.
 
What are the system resources needed just to generate still images at standard definition?
Like Rembrandt said, it's all about the VRAM. 8GB is probably the bare minimum for SDXL; my Flux workflow uses ~18GB, but you can fit it in 16GB with some optimizations (that have speed/quality tradeoffs). Nvidia cards have better software support and therefore generally faster generation times, but come with a price premium.

What's nice is that there are services where you can rent high-end GPUs for like $0.20-0.40 an hour, so you can try stuff out without dropping a couple thousand dollars on your own hardware. There are a bunch out there now: runpod, vast, etc.
 
All in all, the more you have, the better, but at least start with 16GB of RAM and 8GB of VRAM. Most of all, get yourself an SSD so those gigabyte-large models load in seconds.
Like Rembrandt said, it's all about the VRAM.
So that's why online AI generators usually have BS like "sign in with Google", needing a "Premium account", etc.: at least partly because of all the computing resources used on their end?

And how come so many system resources are used for image generation anyway?
 
And how come so many system resources are used for image generation anyway?
First of all, VRAM. A diffusion model can only hold so much learned information in its weights, and to be effective it needs a lot of them. Stable Diffusion 1.5 models were ~2GB, SDXL models are ~6.5GB, and Flux.1 dev is ~16GB - each larger than the last, holding more data and capable of more. The entire model has to be loaded into VRAM, as running it off of system RAM or the drive would be tremendously slow.
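To put rough numbers on that, here's a minimal sketch of the back-of-the-envelope math. The parameter count below is my own ballpark example, and real checkpoint files run bigger because they bundle text encoders and VAEs:

```
def weight_vram_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone: parameter count x precision (fp16 = 2 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

# e.g. a ~2.6B-parameter UNet (roughly SDXL-sized) in fp16:
print(f"{weight_vram_gb(2.6):.1f} GB")  # ~4.8 GB for the weights, before activations
```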

For example, the full Deepseek R1 model with 671 billion parameters, which is the real deal that you use on their website, is 404GB. You'd need a proper Nvidia server rack with GPUs connected via NVLink to get a grand total of 404GB of free VRAM to run it, and that would cost you as much as a brand new house. In comparison, the smaller model with 32 billion parameters is only 20GB. It is way less detailed than the full model, but still fairly capable, and you could run it on a single 3090 with 24GB of VRAM.

Second of all, computation power. Everything, from latent diffusion to LLMs, is simply heavy arithmetic operations on mathematical matrices. That's why the 20 series was much better at these workloads than the 10 series: it introduced dedicated cores for matrix computations, the Tensor cores. GPUs are perfect for this type of workload, as they have tens of thousands of cores that can do such calculations in parallel.
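To make that concrete, here's a minimal PyTorch sketch of the primitive everything reduces to (assuming a CUDA-capable card; the sizes are arbitrary):

```
import torch

# A single big half-precision matrix multiply - the operation that both
# LLMs and diffusion models spend almost all of their time on.
a = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
b = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")
c = a @ b  # dispatched to Tensor cores on RTX 20-series and newer
```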

It's also the reason why Nvidia became such a market giant and asserted dominance in the space with CUDA. Jensen predicted this is where tech would go, so very early on Nvidia created a framework that lets you utilize the same cores used for your video games to do such computations.

And of course, it's the reason VRAM is important: the computations are done on the GPU, and the VRAM is the fastest memory connected to it, right on the board. Anything else introduces a bottleneck. Reading from an NVMe drive? Now you're adding the PCIe lanes to the equation. System RAM? That's the CPU. VRAM? No bottleneck; it's a direct connection, one that you need for good performance with a workload this heavy.
This video was probably already linked here, but it explains how it all works. The concepts behind how LLMs work also apply to latent diffusion models, since they're all transformer models stemming from the 2017 research paper "Attention Is All You Need". All the current "AI" models began with this paper.
 
The entire model has to be loaded into VRAM, as running it off of system RAM or the drive would be tremendously slow.
Depends on what you want to do. I'm running FLUX.1 on a mobile 3070 (8GB VRAM) system with 40GB of system RAM, at about 2-3 minutes per generation. It's hitting swap hard on a fast NVMe, but it works. Obviously not fast enough for 'real' work, but enough to check it out.
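For anyone wanting to try the same low-VRAM trick outside a full UI, here's a rough sketch of how the offloading looks with the diffusers library. The prompt is just illustrative, and FLUX.1-dev requires accepting the license on Hugging Face first:

```
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
# Stream weights to the GPU one module at a time instead of loading them all
# at once - much slower, but it fits in 8GB of VRAM plus system RAM/swap.
pipe.enable_sequential_cpu_offload()
image = pipe("a lighthouse at dusk", num_inference_steps=28).images[0]
image.save("flux_test.png")
```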
 
For example, the full Deepseek R1 model with 671 billion parameters, which is the real deal that you use on their website, is 404GB. You'd need a proper Nvidia server rack with GPUs connected via NVLink to get a grand total of 404GB of free VRAM to run it, and that would cost you as much as a brand new house. In comparison, the smaller model with 32 billion parameters is only 20GB. It is way less detailed than the full model, but still fairly capable, and you could run it on a single 3090 with 24GB of VRAM.
Not quite. The small 20GB models you're talking about, QwQ and Deepseek-Qwen, are really just the Qwen model that has been trained to think like Deepseek. It's remarkably good for what it is, but it isn't actually Deepseek. This is not so obvious if you ask it a reasoning question, but glaringly clear when you ask it a fact question.

What you could do is run the unsloth Deepseek R1s. These are a set of pruned versions of the real R1, ranging from 131GB to 212GB, which is still huge but can be run in a homelab setup (for example, 128GB of RAM and two 3090s with an NVLink lets you run a ~160GB model reasonably well). You'd load the most heavily accessed layers of the model into your GPUs, and the rest sits in RAM and runs on the processor. It's still quite slow compared to the cloud model, but this is the way to go if you want to actually run Deepseek at home.
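As a rough sketch of what that layer split looks like with the llama-cpp-python bindings (the GGUF filename and layer count are illustrative - use whichever unsloth quant you actually downloaded, and tune n_gpu_layers to your VRAM):

```
from llama_cpp import Llama

# n_gpu_layers controls the split: that many layers live in VRAM,
# and the remainder runs from system RAM on the CPU.
llm = Llama(model_path="DeepSeek-R1-UD-IQ2_XXS.gguf", n_gpu_layers=30)
out = llm("Why is VRAM faster than system RAM?", max_tokens=128)
print(out["choices"][0]["text"])
```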
 
I highly recommend using this addon on top of Framepack, adds some much needed boosts to generation speed and some QoL additions to the UI. Despite what is says on the tin the software is weird when it comes to utilizing RAM and GPU VRAM, this helps alleviate the issue. Went from 15-30 seconds per iteration down to 2 as it should have been.

Received a new update.

  1. Latent Window Size parameter added
  2. States (Presets Manager)
    a) It can save settings and prompts
    b) Load last state (magic :) - on the next launch or page reload it can restore your last used settings and prompts
    c) Your settings will be inside the "states" directory under the "webui" folder. You can share your settings and prompts with people.
 
This is probably a stupid question, but searching around hasn't given me any answers. Can an LLM, or any 'ai' model be iterated on?

I mean, can they take a model that's been made and keep the quality but make it smaller/quicker/cheaper? From where I am sitting we are hitting 'good enough' quality from some models - like Claude 3.7 for writing, and some of the image generators, and now it's just the price that is the issue, taking 60-70% of the cost off suddenly makes the high quality top end models very usable.

But from my admittedly ignorant research, I've seen some people say that the models are essentially black boxes, people make them, but have little idea how they actually work, and so once it is done 'baking' there is little that can be done to change or improve them.
 
This is probably a stupid question, but searching around hasn't given me any answers. Can an LLM, or any 'ai' model be iterated on?

I mean, can they take a model that's been made and keep the quality but make it smaller/quicker/cheaper? From where I am sitting we are hitting 'good enough' quality from some models - like Claude 3.7 for writing, and some of the image generators, and now it's just the price that is the issue, taking 60-70% of the cost off suddenly makes the high quality top end models very usable.

But from my admittedly ignorant research, I've seen some people say that the models are essentially black boxes, people make them, but have little idea how they actually work, and so once it is done 'baking' there is little that can be done to change or improve them.
They are essentially black boxes, but there have been massive improvements in reducing the resources needed. Though every iteration changes the model in ways that nobody fully understands. So are you losing something fundamental? Maybe? "Nobody really knows" is, I think, the honest answer.
 
This is probably a stupid question, but searching around hasn't given me any answers. Can an LLM, or any 'ai' model be iterated on?

I mean, can they take a model that's been made and keep the quality but make it smaller/quicker/cheaper? From where I am sitting we are hitting 'good enough' quality from some models - like Claude 3.7 for writing, and some of the image generators, and now it's just the price that is the issue, taking 60-70% of the cost off suddenly makes the high quality top end models very usable.

But from my admittedly ignorant research, I've seen some people say that the models are essentially black boxes, people make them, but have little idea how they actually work, and so once it is done 'baking' there is little that can be done to change or improve them.
That's an active area of research; there are a number of different approaches. E.g. quantization - basically changing weights into lower-precision data types. This reduces accuracy a bit but significantly reduces the VRAM required. Or there's distillation, where you use a resource-intensive model to train a smaller model to give similar responses. A good example of this is DeepSeek, where they fine-tuned smaller Qwen and Llama models to perform better using DeepSeek-R1 as the "teacher".
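As a toy illustration of the quantization idea (real schemes like GPTQ or the GGUF quants handle outliers much more carefully, but the memory arithmetic is the same):

```
import torch

w = torch.randn(4096, 4096)                  # fp32 weights: 64 MB
scale = w.abs().max() / 127.0                # map the weight range onto int8
w_q = torch.round(w / scale).to(torch.int8)  # stored weights: 16 MB (4x smaller)
w_approx = w_q.float() * scale               # dequantize at compute time
print("max rounding error:", (w - w_approx).abs().max().item())
```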
 