Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

Little tip to get around censorship if you use online generators: type the censored words in a different language.

French works well, which is curious when you think about it; you would assume French words would still get caught, since native English speakers use them in some pretentious circles like cuisine, art or fashion, but the AI will draw them regardless.
 
This reminds me, does anyone know anything about PonyXL drama? I recall hearing that the model's developer obfuscated a bunch of tags, and also won't share those obfuscated tags with people, but I expect I'm getting the technical details wrong.
they specifically obfuscated artist tags. it turns out that the model actually knows quite a lot of artists and styles by default without needing LORAs. not to worry, the dedicated gooners on 4chan have been mining them and there's a spreadsheet: https://lite.framacalc.org/4ttgzvd0rx-a6jf

it's for porn obviously so the sample links are all NSFW
 
Seems kind of bad, but I can understand why. I've trained LORAs that can create work that looks almost indistinguishable from the artists they're based on. That's probably discouraging for artists who are genuinely good, so obfuscating the tags makes sense as a way to encourage artists to keep producing content. I don't know why they wouldn't just remove those tags, though.
 
This honestly can happen in tons of software. Millions of dependencies, very little oversight of what's even in them; all it takes is one GitHub account to be compromised or one developer with a hidden agenda. These are called supply chain attacks. That's not exactly what happened here, but ComfyUI also has a rats nest of dependencies. I would always sandbox applications like Comfy. The open source community is far too trusting about running arbitrary code. The amount of times some anonymous, random literally-who just links his GitHub on reddit, says "hey people run this" and people actually just do it is insane.
Very much this. I'd say there's a further problem with Stable Diffusion in that its user base is so much less technical than most projects with this level of rough and ready code. Obviously not all. There are technical users such as yourself who have good knowledge of the software; there are technical users such as myself who have good knowledge in general or in our own areas but not that much familiarity with the software (takes time to learn stuff even if you have the skills); and then you have the follow-a-guide, post-for-help-on-reddit crowd, which makes up a far larger proportion of this project than most. If someone wrote their own module for Pulumi and stuck it on GitHub, the only people at risk would be people who are equipped to look at it and go "that's not right". It's not 100% guaranteed they will, but that community would be a hell of a lot more resistant than a bunch of enthusiastic guide followers who just want to make booby elf women (mostly).
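If you want to make the rats-nest point concrete for your own setup, here's a quick sketch using only the Python standard library (nothing ComfyUI-specific, it just walks whatever environment you run it in) that shows how much third-party code you're implicitly trusting:

```python
# Rough sketch: list every installed distribution in the current Python
# environment and how many further requirements each one declares.
# Standard library only (Python 3.8+).
from importlib.metadata import distributions

for dist in sorted(distributions(), key=lambda d: (d.metadata["Name"] or "").lower()):
    requires = dist.requires or []  # declared dependencies, possibly empty
    print(f"{dist.metadata['Name']} {dist.version}: {len(requires)} declared requirements")
```

Run that inside a ComfyUI venv and the count alone makes the point: any one of those packages getting compromised is enough.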

Thankfully, places like GitHub are developing the tools to detect such malicious code in real time. Using AI! :) We're getting close to the point, I think, where someone uploads some blatantly malicious code and the site itself flags it and raises concerns. It won't be foolproof but it can catch a lot of the low-hanging fruit. (Of course that could just create over-confidence on the part of users who then trust the more sophisticated mal-packages, but hey...)

Anyway, in the "I read reddit so you don't have to" section:

ComfyUI support for SD3 just dropped:

We are (allegedly) one day from the 2B version of SD3 being released.

Someone linked this quite interesting article about colour bias in SDXL.

In short, he says that SDXL has a bias towards yellow due to an absence of blue in its training data, he shows how the colour space it uses falls outside the bounds of a normal colour space, and then he wrote a bunch of code that, so far as I can see, dynamically corrects colours during generation. Lots of comparison images. He has an interactive demo where you can compare the impact of his different colour correction techniques to the unaltered image:

Note, this isn't generating an image in real time. He has made a matrix of 300-something possible combinations of the techniques at different stages, I think. So the images vary slightly in composition but it's enough for you to see the effect of colour correction. I found it pretty cool.
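For anyone who doesn't want to read the whole article, the underlying idea is essentially white balancing. This isn't his code (his corrections happen during generation), but a grey-world rebalance in image space is about the simplest analogue of what the correction is doing:

```python
# Minimal grey-world white balance sketch: the image-space analogue of the
# article's idea. The author's actual corrections run during generation.
import numpy as np

def neutralize_cast(img: np.ndarray) -> np.ndarray:
    """img: float RGB array in [0, 1], shape (H, W, 3)."""
    channel_means = img.reshape(-1, 3).mean(axis=0)  # average R, G, B levels
    grey = channel_means.mean()                      # neutral grey target
    corrected = img * (grey / channel_means)         # scale each channel toward grey
    return np.clip(corrected, 0.0, 1.0)
```

A yellow-biased image has inflated red/green means relative to blue, so the rescale boosts blue and pulls the image back toward neutral.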

Finally, I think I found one of the worst posts on the Stable Diffusion subreddit:

First the guy starts drawing unexplained analogies between the trained resolution of SD3 and Composite vs. S-Video vs. Component cables, saying that something may be the same resolution but better quality. The analogy makes little sense and is just a way of implying that SD3 being trained at 512x512 isn't a bad thing. It is. Then you get guff about how there are over 7,500 papers on Google Scholar that build on the SD model and how "all of this knowledge could be potentially transferred to newer newer architectures [sic]". The majority of that is citation farming and none of it is about SD3 specifically. Then a bunch of cope about how 2B isn't a "skimped model" because "if the 8B model is undertrained a much smaller model can outperform it". Well sure, IS that the case? And are you saying the SD3 8B version IS undertrained?

I don't know why I'm reporting on this post here other than that it annoyed me, and it's perhaps the most perfect example I've ever seen of someone dressing up absolutely no information or insight in a bunch of high-flown language and logical fallacies and getting away with it. It's like the Jabberwocky scene from Better Off Ted in real life. But that's 90% of reddit, I guess.

Anyway, imo, we ARE getting a lesser model by it being the 2B version. I suspect most of the stuff about "we want as many people as possible to be able to run it" is after-the-fact spin on them not being ready yet with the larger models. I base that on the fact that people serious about this should be able to get the hardware to run larger models or already have it. And the 512x512 resolution, if that's finally substantiated, is just piddly and crap.

Nonetheless, I am keen to see SD3. My experience with it via the API shows a lot of potential, though I suspect that's a larger than 2B version.

Finally, apropos of nothing, this image I generated amused me as a great example of how generative AI can go down a wrong path.
supergirl_flying.png

I was playing around with what the ICBINP model was capable of, as it has really impressed me with its realism. I wanted to give it something a little outré, so I asked it to generate an image of Supergirl flying high above a city, viewed from above. As you can see, it started with the essentials but then its generative nature extended things down to the ground and did things like add a shadow that still connects her to the ground. The end result is a weird, forced perspective on a 16m-tall Supergirl that still has realistic detail. Weird in all sorts of subtle ways.
 
SD is just such a waste of time for me. Same with LLMs. My ML setup in the basement hasn't been used for a month or two because I uninstalled all my ML stuff.
Is that because you don't have a specific goal and have just generally lost interest, or because you have a specific goal but it just isn't feasible right now?

Is there an offline model that plays nice with AMD?
I mean, using ROCm I am able to use any of the Stable Diffusion based models very effectively. With 20GB VRAM on my AMD card it blasts out 1024x1024 SDXL images fast enough that I can try something, see the results and then make changes based on them. I use ComfyUI to do so.

The catch is that ROCm on Windows is waaay behind. You need to do it on Linux. But the hardware side works well.
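If anyone wants to check their own setup: the ROCm builds of PyTorch reuse the torch.cuda API, so the usual sanity check works unchanged (a quick sketch, assuming you installed a +rocm wheel):

```python
# Sanity check that a ROCm build of PyTorch can see the AMD GPU.
# PyTorch's ROCm backend is exposed through the torch.cuda API.
import torch

print(torch.__version__)          # a ROCm wheel reports something like "2.x.x+rocm6.x"
print(torch.cuda.is_available())  # True means the AMD card is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    print(torch.cuda.get_device_properties(0).total_memory // 2**30, "GiB VRAM")
```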
 
Well, pardon the double post, but nobody else has replied in the meantime and I feel this is significant. SD3 is now publicly available. In theory not for another 15 minutes, but I was already able to download the model weights and sample workflows for ComfyUI. So... back later with some locally produced images and comparisons, hopefully.

I also intend to do a bit of a like-for-like comparison with their API version, which they let slip is in fact the 8B version, once I've got things sufficiently nailed down to ensure I'm not accidentally introducing some difference. So we'll get to see how much difference the missing 6B parameters (man) make.
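For the comparison itself, the main thing is holding every knob fixed on both sides. A sketch of the sort of settings I'll be pinning (shown in diffusers terms for brevity rather than ComfyUI's workflow format; the values are placeholders, not my actual test settings):

```python
# Sketch of the generation settings that must match for a like-for-like
# comparison. Values here are illustrative placeholders.
import torch

generator = torch.Generator(device="cuda").manual_seed(1234)  # fixed seed

settings = dict(
    prompt="photo of a lighthouse at dusk",  # identical prompt on both sides
    num_inference_steps=28,                  # match the step count
    guidance_scale=7.0,                      # match the CFG scale
    width=1024,
    height=1024,
    generator=generator,
)
# image = pipe(**settings).images[0]  # pipe: whichever pipeline is under test
```

Even with all of that pinned, different samplers and implementations mean the outputs won't be pixel-identical; the point is just to stop settings drift from polluting the comparison.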
 
Cool, keep us posted. If you can try some examples of the same prompts on other SD models to compare, that would be neat.
 
My default filters on Civitai have apparently been updated to enable SD3 automatically.
I usually just search for SDXL stuff.
Guess I could try it.
 
Well, I'm glad I don't work for Stability AI today.

Cool, keep us posted. If you can try some examples of the same prompts on other SD models to compare, that would be neat.
I can happily do you some comparisons between models, e.g. SDXL (though not much point in doing base, might as well pick a tuned model) and SD3. But that said, if you've seen Human Centipede then you already have an idea what SD3 is like.

Okay, marginally more seriously: it's okay, but it seems to have a real bias towards anime and cartoonish styles, and it seems to get its understanding of human anatomy from John Carpenter's The Thing. Maybe fine tunes will lead to more impressive results. Thing is, the new commercial licence has already ruffled feathers. The PonyXL people have already said it makes it impossible for them to do a Pony version based on it. I don't use PonyXL because it seems to be largely focused around porn (unless I'm wrong) but if that's indicative of other players it could be a problem.

The API version I was using was and is noticeably better than this. And they confirmed that the API is running the 8B version, not the 2B they've just released.
 
Yeah, people seem very unimpressed with SD 3. Reddit is full of people posting messed-up anatomy and bad gens; it seems to have a significantly worse understanding of anatomy than SDXL.

The PonyXL people have already said it makes it impossible for them to do a Pony version based on it. I don't use PonyXL because it seems to be largely focused around porn (unless I'm wrong) but if that's indicative of other players it could be a problem.
It does porn, but it also has very good danbooru-style tagging and a bunch of baked-in artist styles for 2D, anime, cartoons, etc. It has basically cornered the market on that segment for SDXL, especially since people started training LORAs specifically for Pony and Civitai added it as a base model in their searching/filtering system.

Here's the article discussing the licensing issues: https://civitai.com/articles/5671
 
Yeah, people seem very unimpressed with SD 3. Reddit is full of people posting messed-up anatomy and bad gens; it seems to have a significantly worse understanding of anatomy than SDXL.
My normal view of that subreddit is that it's whiny, entitled people who want everything for free and yesterday. At present, however, I feel they may have a point. I'll still give it time to see how it turns out. It's definitely better in some ways, so it might be a better base for building higher in time. That said, I feel they made a mistake in not releasing the 8B version. Having that out there would go a long way towards deflating some of the criticisms, as I don't think it has anywhere near the same issues.

It does porn, but it also has very good danbooru-style tagging and a bunch of baked-in artist styles for 2D, anime, cartoons, etc. It has basically cornered the market on that segment for SDXL, especially since people started training LORAs specifically for Pony and Civitai added it as a base model in their searching/filtering system.
Maybe I'll grab it then. I've veered away from the more prurient models, but if it's more general purpose than I thought, maybe I will. The thing is, independent of that, it's a very, very successful family of models either way. For them to ditch the possibility of working on SD3 is potentially a big deal in terms of community engagement. Someone at SAI also used the term "Dunning-Kruger" in conversation with one of the PonyXL people. It's not looking good. Basically they'd be limited to 6,000 generated images per month under the licence, which is nowhere near viable.

Here's the article discussing the licensing issues: https://civitai.com/articles/5671
Thanks. Will read - will probably improve my understanding as it's not really something I have followed.
 
So is SD3 supposed to be less resource-intensive than Cascade or XL?
I don't know about Cascade. I only dabbled with that and it went like shit off a shovel (Bong expression meaning very fast). But compared to XL, I was finding SD3 around the same as SDXL models on like hardware and resolutions. I could do some more formal comparisons if people like. The model itself is only 4GB; base SDXL without the refiner is 6GB. But SD3 is structured differently, with parts of the text encoding separated out, and if you roll everything in together it ends up around 10GB. Pure file size or parameter counts aren't a great guide to complexity and performance, though. I think the fact that SD3 is split up into smaller parts can make it quicker. I have 20GB VRAM, so I can just throw everything into that with room to spare. I'd be interested to see if it's more performant when you have more constrained hardware - I suspect it might be.
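As a sketch of what that modular structure buys you on constrained hardware (this assumes the diffusers library rather than ComfyUI; diffusers lets you drop SD3's big T5 text encoder entirely and stream the remaining components to the GPU on demand):

```python
# Hedged sketch (diffusers >= 0.29): load SD3 without the large T5 text
# encoder and offload components to cut peak VRAM usage.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    text_encoder_3=None,   # skip the multi-GB T5 encoder entirely
    tokenizer_3=None,
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()  # keep only the active component in VRAM

image = pipe("a lighthouse at dusk", num_inference_steps=28).images[0]
image.save("sd3_test.png")
```

You lose whatever prompt-following the T5 encoder contributes, but the CLIP encoders alone still work, which is exactly the kind of trade-off a monolithic checkpoint can't offer.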
 
At present, however, I feel they may have a point. I'll still give it time to see how it turns out. It's definitely better in some ways, so it might be a better base for building higher in time. That said, I feel they made a mistake in not releasing the 8B version. Having that out there would go a long way towards deflating some of the criticisms, as I don't think it has anywhere near the same issues.
Looks like a disaster so far, and you could easily imagine the same issues extending to the 8 billion parameter model if the problem is that they attempted to make SD3 "safe". The coomers will probably end up being the ones to salvage it with retraining.
Ridiculed Stable Diffusion 3 release excels at AI-generated body horror (archive)
AI image fans are so far blaming the Stable Diffusion 3's anatomy failures on Stability's insistence on filtering out adult content (often called "NSFW" content) from the SD3 training data that teaches the model how to generate images. "Believe it or not, heavily censoring a model also gets rid of human anatomy, so... that's what happened," wrote one Reddit user in the thread.
 
Anyone have experience with LumaAI and any tips for prompts to make an image move?
 
Seems kind of bad, but I can understand why. I've trained LORAs that can create work that looks almost indistinguishable from the artists they're based on. That's probably discouraging for artists who are genuinely good, so obfuscating the tags makes sense as a way to encourage artists to keep producing content. I don't know why they wouldn't just remove those tags, though.
The fun thing is that with a lot of these checkpoints you'll just get garbage unless you're trying to generate something in a specific artist's style. They've just been loaded up with so much noise that you need to clamp down on a specific tag to get something coherent.
 