FLUX.2: Frontier Visual Intelligence
The dev model is 64GB so you'll have to hope for some quants to come out later down the line.
EDIT: There are already GGUF quants out for it lol. 5-bit for 24GB havers and 6-bit for 32GB havers.
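For anyone wondering where those numbers come from, here's a rough back-of-the-envelope sketch. It assumes the 32 billion parameter count quoted for Flux 2 in this thread and treats "N-bit" as exactly N bits per weight. Real GGUF quants store extra scale data, so actual files come out somewhat larger, and the text encoder, VAE and activations need memory on top. The function name `weight_size_gb` is just made up for illustration.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes.

    Assumption: every weight takes exactly `bits_per_weight` bits,
    with no quantization metadata or other overhead.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


FLUX2_PARAMS_B = 32  # parameter count quoted in this thread

for bits in (16, 6, 5):
    size = weight_size_gb(FLUX2_PARAMS_B, bits)
    print(f"{bits}-bit: ~{size:.0f} GB of weights")
```

Under those assumptions, 16-bit gives the 64GB figure, 6-bit lands at about 24GB and 5-bit at about 20GB. That would explain why 6-bit is pitched at 32GB cards: 24GB of weights leaves no headroom on a 24GB card.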
Z Image Turbo was released yesterday. It's a new Chinese model with 6 billion parameters (for comparison, Flux 2 has 32 billion). It uses a small Qwen LLM for text encoding, which allows natural-language prompts with surprisingly good understanding. Text, prompt adherence, aesthetics, hands: everything is really, really good for a model that runs easily on 16GB of VRAM. It doesn't know what a kiwi bird is, though.
HuggingFace model page
ComfyUI workflow
A base (i.e. non-turbo) and edit model are still to be released.
I'll take six billion over 32-64 billion any day. I tested it on my 3060. It works nicely on my 8GB card. Average generation time is under a minute.
Though I feel it'll become the same song and dance, with the whole "Look how REALISTIC™ it can make this 1girl shot! It beats (model everyone previously glazed); no competition!" shtick, and then it's just run-of-the-mill prompts or a vaguely attractive woman staring at you. I love messing with realism myself, but there's experimenting and testing what it's capable of, and then there's glazing because it can do basic prompts but slightly more visually appealing.



It's okay, OFCOM - I wasn't looking for the dodgy stuff, I just wanted to know if there were good Sci-Fi aesthetics. Yeesh!
