Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

didelphigina · Nov 27, 2025

Slav Power said:
FLUX.2: Frontier Visual Intelligence
The dev model is 64GB so you'll have to hope for some quants to come out later down the line.

EDIT: There are GGUF quants for it already out lol. 5-bit for 24GB havers and 6-bit for 32GB havers.

Slurred said:
Z Image Turbo was released yesterday. It's a new Chinese model with 6 billion parameters (for comparison, Flux 2 has 32 billion). It uses a small Qwen LLM for text encoding, allowing natural language prompts with surprisingly good understanding. Text, prompt adherence, aesthetics, hands, everything is really, really good for a model that will run easily on 16GB VRAM. It doesn't know what a kiwi bird is though.
View attachment 8221756
HuggingFace model page
ComfyUI workflow

A base (i.e. non-turbo) and edit model are still to be released.

I'll take six billion over 32-64 billion any day. I tested it on my 3060. It works nicely on my 8GB card. Average generation time is under a minute.
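
For a rough sense of why the sizes quoted above line up, here is a back-of-the-envelope sketch that estimates weight size from parameter count and bits per weight. It assumes the nominal bit width (real GGUF quants carry some extra bits for scales), counts the weights only (no text encoder, VAE, or activations), and uses 1 GB = 10^9 bytes. The function name `weight_size_gb` is made up for illustration.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every weight is stored at the same nominal bit width.
    """
    # billions of params * bits per param / 8 bits per byte = billions of bytes
    return params_billions * bits_per_weight / 8


# Parameter counts quoted in the thread: Flux 2 at 32B, Z Image Turbo at 6B.
estimates = {
    "Flux 2, 16-bit": weight_size_gb(32, 16),
    "Flux 2, 6-bit": weight_size_gb(32, 6),
    "Flux 2, 5-bit": weight_size_gb(32, 5),
    "Z Image Turbo, 16-bit": weight_size_gb(6, 16),
    "Z Image Turbo, 8-bit": weight_size_gb(6, 8),
}

for name, size in estimates.items():
    print(f"{name}: ~{size:.0f} GB")
```

Under these assumptions the 16-bit Flux 2 estimate matches the 64GB figure, and the 5-bit and 6-bit estimates (about 20 GB and 24 GB) leave some headroom on 24GB and 32GB cards respectively. The 6B model comes out around 12 GB at 16-bit, so a card smaller than 16GB would presumably need a quantized version or some offloading.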

Though, I feel it'll become the same song and dance with the whole "Look how REALISTIC™ it can make this 1girl shot! It beats (model everyone previously glazed); no competition!" shtick and then it's run-of-the-mill prompts or a vaguely attractive woman staring at you. I love messing with realism myself but there's experimenting and testing with what it's capable of and then there's glazing because it can do basic prompts but slightly more visually appealing.

Slav Power · Nov 27, 2025

Chairman Xi did it again

MennilTossFlykune · Nov 30, 2025

Slurred · Dec 1, 2025

Overly Serious said:
What would be the most interesting models to play with - focus mostly realistic photos (I'm not talking about porn) and maybe some supernatural elements - I thought I might make illustrations for some RPG stuff, character portraits, things like that. But also just for fun to push the boundaries of what these models are capable of now. Would be fun to do some hyper-realistic character portraits that were short animations as well.

Both Flux 2 and the new Z Image model would be suitable for this, but Flux 2 is very heavy on system resources, I wouldn't bother with it unless you have at least 24GB VRAM, and even then you'll need to use a quantized version. In the realm of slightly older models, Qwen Image and Flux Krea, a finetune of Flux 1 Dev specifically designed to minimize the "AI look", should also work well. WAN 2.2 seems to be the video model everyone's using if you want to dabble with that. I haven't used it much myself.

As far as closed-source services go, Nano Banana Pro seems to be the ruler of the roost right now.

Overly Serious said:
Also am I right in surmising that LoRAs and fine-tunes are less of a thing now? All the big discussion now seems to be around huge models like Flux 2 and such and I don't think there's the same community modding/LoRA/fine-tune sort of scene around those due to size? But maybe I've just not been looking.

Every model that's been out for a while has some amount of finetunes and LoRAs available. The bigger models don't have quite as many as SDXL and its derivatives, but there's still plenty around. Z Image Turbo hasn't even been out a week and there are already dozens of LoRAs for it on Civitai.

Though it is true that with more capable base models, there is less of a need for LoRAs. My unscientific, vibes-based opinion is that you see a lot more style LoRAs now, whereas before you'd have LoRAs for poses or things like five-fingered hands, and this reflects how models have progressed from fucking up human anatomy in a wide variety of artistic styles to mostly getting hands and proportions correct in a few basic styles.

didelphigina said:
Though, I feel it'll become the same shtick with the whole "Look how REALISTIC™ it can make this 1girl shot! It beats (model everyone previously glazed); no competition!" shtick and then it's run-of-the-mill prompts or a vaguely attractive woman staring at you. I love messing with realism myself but there's experimenting and testing with what it's capable of and then there's glazing because it can do basic prompts but slightly more visually appealing.

The devs have said themselves that Z Image Turbo is essentially an aesthetic finetune of their base model. And right now it's getting a lot of credit just because it mostly doesn't give people plastic skin, a problem that has continued to dog the base Flux models into the era of Flux 2. I've found it to be pretty good with fairly complex prompts featuring text, multiple distinct characters, different poses, five-fingered hands, coherent backgrounds, etc... and it does so a lot faster than other models with only slightly better prompt understanding.

Overly Serious · Dec 1, 2025

Slurred said:
Both Flux 2 and the new Z Image model would be suitable for this, but Flux 2 is very heavy on system resources, I wouldn't bother with it unless you have at least 24GB VRAM, and even then you'll need to use a quantized version. In the realm of slightly older models, Qwen Image and Flux Krea, a finetune of Flux 1 Dev specifically designed to minimize the "AI look", should also work well. WAN 2.2 seems to be the video model everyone's using if you want to dabble with that. I haven't used it much myself.

As far as closed-source services go, Nano Banana Pro seems to be the ruler of the roost right now.

Thanks for that. I have managed to run Flux1.dev locally. It certainly wasn't instant but it completed in about 7mins iirc. Good enough to try out just to see it work, but not enough for blasting out concepts and seeing the effect of changes quickly. I also managed to get WAN 2.2 working locally but again, quite a while to wait for a short clip. Impressive that it's possible at all, though. What I was actually thinking was rent some time on a site like Runpod and just have at it with some higher end hardware and high VRAM. Given that it's paid time and also that it takes a while to upload the models to the storage (which I also pay for), I want to have a clear plan for what I want to try out. What I'd like to do is some realistic character portraits that were animated. I think that would be fun.

Slurred said:
Every model that's been out for a while has some amount of finetunes and LoRAs available. The bigger models don't have quite as many as SDXL and its derivatives, but there's still plenty around. Z Image Turbo hasn't even been out a week and there are already dozens of LoRAs for it on Civitai.

Though it is true that with more capable base models, there is less of a need for LoRAs. My unscientific, vibes-based opinion is that you see a lot more style LoRAs now, whereas before you'd have LoRAs for poses or things like five-fingered hands, and this reflects how models have progressed from fucking up human anatomy in a wide variety of artistic styles to mostly getting hands and proportions correct in a few basic styles.

What I recall was there used to be an absolute tonne of LoRAs for particular styles and characters, fictional, celebrity, fantasy or sci-fi species, aesthetics for particular movie styles. At least there were in the SDXL era.

Ha! I just decided to try looking at Civitai to see what the LoRA ecosystem was for newer models. Got a big message telling me it was restricted for Bongland visitors. :suffering:

It's okay, OFCOM - I wasn't looking for the dodgy stuff, I just wanted to know if there were good Sci-Fi aesthetics. Yeesh!

Soggy Floppa · Dec 1, 2025

after playing with it for a bit, I predict Z Image will take off like Illustrious did (and unlike Flux, which failed to really take off) once more of the official weights and tooling get released. nothing else seems to have really dethroned SDXL base models quite yet. It depends on how cheap and fast the lora training can become and how versatile the model can be.

Overly Serious · Dec 1, 2025

Soggy Floppa said:
after playing with it for a bit, I predict Z Image will take off like Illustrious did (and unlike Flux, which failed to really take off) once more of the official weights and tooling get released. nothing else seems to have really dethroned SDXL base models quite yet. It depends on how cheap and fast the lora training can become and how versatile the model can be.

Just had a bit of a play with it now following @Slurred's reply above and it does seem pretty good.

Jacknife · Dec 1, 2025

MennilTossFlykune said:
View attachment 8235106

LMAO Palm trees? And why are the Heather Flowers 5 times as tall?

Pretty cool tho, got major nostalgia as I saw it

The Mass Shooter Ron Soye · Dec 10, 2025

AMD supported Amuse AI software goes open source, works with all GPU vendors

Amuse AI just went open source on GitHub, but it is not actually an AMD-made app. It is built by TensorStack AI, a small startup that works with AMD as a software partner and gets promoted on AMD’s site as a recommended front end for Ryzen AI and Radeon hardware. The GitHub repo calls this the “final version”, so this looks more like a curtain call than the start of a new development cycle.

To be honest, it looks like the project was effectively ended months ago. The last update was in April about a collab between AMD, Stability AI and AmuseAI:

The Mass Shooter Ron Soye · Dec 11, 2025

Disney Accuses Google of Using AI to Engage in Copyright Infringement on ‘Massive Scale’ (archive)

As Disney has gone into business with OpenAI, the Mouse House is accusing Google of copyright infringement on a “massive scale” using AI models and services to “commercially exploit and distribute” infringing images and videos.

Mikoyan · Dec 11, 2025

The Mass Shooter Ron Soye said:
Disney Accuses Google of Using AI to Engage in Copyright Infringement on ‘Massive Scale’ (archive)

And then licenses out to OpenAI. Big hmmmmmmmmmmmmm.

Mike Matei's Penis · Dec 11, 2025

Sora and Disney? Where have we seen this foretold before?

The Mass Shooter Ron Soye · Dec 12, 2025

Mikoyan said:
And then licenses out to OpenAI. Big hmmmmmmmmmmmmm.

Some analysis: https://lite.cnn.com/2025/12/11/business/disney-openai-hedge

grump · Dec 13, 2025

Z-Image Base is confirmed to be dropping soon

https://tongyi-mai.github.io/Z-Image-blog/

For those unfamiliar, this is a new SOTA tunable image generator with Nano Banana/Flux-like editing capabilities, but open source and without being nerfed into the floor like Western generators. Huge news not only for NSFW prompters but also for those who just want to make a silly picture of random internet celebs and are tired of g00gle's random refusals to do completely innocent things and Black Forest's constant cringe bragging about how nerfed their models are and how difficult it is to overcome the censorship.

Freeing people from having to rely on gimped Western cloud models is a rare Chinese win. The only issue for me is that it's only 6 billion parameters; I'm sure they have, or are capable of, a much larger, superior model that hopefully could be released soon as well.

The Mass Shooter Ron Soye · Dec 14, 2025

grump said:
Freeing people from having to rely on gimped Western cloud models is a rare Chinese win. The only issue for me is that it's only 6 billion parameters; I'm sure they have, or are capable of, a much larger, superior model that hopefully could be released soon as well.

whatever I feel like · Dec 14, 2025

Slurred said:
Z Image Turbo was released yesterday. It's a new Chinese model with 6 billion parameters (for comparison, Flux 2 has 32 billion). It uses a small Qwen LLM for text encoding, allowing natural language prompts with surprisingly good understanding. Text, prompt adherence, aesthetics, hands, everything is really, really good for a model that will run easily on 16GB VRAM. It doesn't know what a kiwi bird is though.
View attachment 8221756
HuggingFace model page
ComfyUI workflow

A base (i.e. non-turbo) and edit model are still to be released.

Try "pregnant grinch bird", that's what I always used to use to get kiwis.

Drain Todger · Dec 25, 2025

The Mass Shooter Ron Soye said:
It's over?

Introducing Nano Banana Pro (archive)

CNET: Google's Nano Banana Pro Makes Ultrarealistic AI Images. It Scares the Hell Out of Me (archive)

Wired: Hands On With Google’s Nano Banana Pro Image Generator (archive)

Nano Banana Pro is fucking amazing. What the fuck kind of black magic is this, Google?

"Gordon Freeman is holding Twilight Sparkle by the midsection and using her as a living weapon. She is firing off her purple horn beam and looks very irritated. She is approximately 1.1 meters in height and looks bulky and difficult to wield. The whole thing is rendered in a realistic, painterly CG portrait style. No text."

A tribute to Gloverfield the Gardevoir from Omeger Rubyer:

"Do Gloverfield, the chain-smoking yankee Gardevoir, having a cigarette and squatting in a school hallway like a delinquent. She is a literal Gardevoir, not a human. She is rendered in a CG painting style."

"Her head tufts should be more upright and predatory looking, and her eyes should have a bloodshot, psychotic, murderous look to them."

Same energy:

Me Gusta HD:

The Mass Shooter Ron Soye · Dec 25, 2025

Drain Todger said:
Nano Banana Pro is fucking amazing. What the fuck kind of black magic is this, Google?

It looks like the endgame for image generation/synthesis is in sight, particularly with the editing features that maintain coherence. I could easily see Photoshop working like that demo, except you import things as layers, and do the AI operations on selected layers.

The miscellaneous text and logos (and tramp stamp) are pretty good in your gens but could probably look better with some inpainting. Google clearly cherry picked some examples where text was "in focus" and getting a lot of attention from the model. Background changes away from the head in your Gardevoir example look very subtle but present. I need to check on another device with a better screen.

For the slavs of the world, what you get in a premium model should be free/local within 6-24 months. So it's worth looking at what these models are capable of. Seems like the "IP infringement" capability is working well, despite Disney suing Google a couple weeks ago.

Drain Todger · Dec 25, 2025

The Mass Shooter Ron Soye said:
It looks like the endgame for image generation/synthesis is in sight, particularly with the editing features that maintain coherence. I could easily see Photoshop working like that demo, except you import things as layers, and do the AI operations on selected layers.

The miscellaneous text and logos (and tramp stamp) are pretty good in your gens but could probably look better with some inpainting. Google clearly cherry picked some examples where text was "in focus" and getting a lot of attention from the model. Background changes away from the head in your Gardevoir example look very subtle but present. I need to check on another device with a better screen.

For the slavs of the world, what you get in a premium model should be free/local within 6-24 months. So it's worth looking at what these models are capable of. Seems like the "IP infringement" capability is working well, despite Disney suing Google a couple weeks ago.

You can already do this. Photoshop already has Nano Banana integration and you can drag-select and inpaint specific parts of an image with generative Nano Banana calls, confine it to specific layers, etc.

The reason why the background looks crappier and crappier with each consecutive edit is because, unfortunately, Nano Banana's edits are lossy. If you run 5+ consecutive edits on the same image, the whole thing looks like muddy crap.

The Mass Shooter Ron Soye · Dec 25, 2025

Drain Todger said:
You can already do this. Photoshop already has Nano Banana integration and you can drag-select and inpaint specific parts of an image with generative Nano Banana calls, confine it to specific layers, etc.

*I could easily see The GIMP 4.0.0 working like that... in 2040.
