@verymuchawful Well, you were right about what was possible. I was surprised at how well I could run Flux locally.
Flux Dev, fp16, 1024x1024, 20 steps } 62 seconds, 2.85s/it
Flux Dev, fp8, 1024x1024, 20 steps } 64 seconds, 2.88s/it
(No, I have no idea why fp8 took longer than fp16. It's not due to model loading; this was consistent across runs.)
Flux Dev, fp8, 512x512, 6 steps } 12 seconds, 1.08s/it
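For what it's worth, a rough way to split those totals is to multiply steps by s/it to get the sampling time and treat whatever is left of the total as everything else. A minimal Python sketch, assuming the s/it figures are averages over the sampling loop and the totals are wall-clock for the whole run; the names (runs, split_time) are just made up for illustration:

```python
# (label, steps, seconds per iteration, total seconds), copied from the runs above
runs = [
    ("Dev fp16 1024x1024", 20, 2.85, 62),
    ("Dev fp8 1024x1024", 20, 2.88, 64),
    ("Dev fp8 512x512", 6, 1.08, 12),
]


def split_time(steps, sec_per_it, total):
    """Return (sampling seconds, remaining seconds) for one run."""
    sampling = steps * sec_per_it
    # Assumption: anything not spent in the sampling loop is "everything else"
    return sampling, total - sampling


for label, steps, sec_per_it, total in runs:
    sampling, other = split_time(steps, sec_per_it, total)
    print(f"{label}: {sampling:.1f}s sampling, {other:.1f}s everything else")
```

This only separates the sampling loop from the rest; it can't say what the rest actually consists of.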
I tried out the slightly cut-down one that Comfy recommends and, as far as I could tell, it made no difference to the times or to how maxed out my VRAM was (I have 20GB). I also tried out Schnell and it gave me better output. I think something was giving out with Dev on my hardware, as I would sometimes get blurred images. (No, it wasn't anything NSFW.) And very, very weirdly, it would seemingly hold onto elements from a previous run. For example, I ask for a drawing of a person with various details. I then change it to "photo of" and add "detailed, realistic", and it still gives me drawings. Swap to a different model and back, and now it gives me realistic photos. I have no explanation for that at all. It shouldn't be possible, but it appeared to be the case.
The bulk of the time for a generation was loading the model, which it seemed to need to do every time. I guess VRAM was perhaps so tight that it freed it up the moment a run was over. I didn't try any of the ones you pointed at yet. And to be clear, my view on how things are going long-term is the same. However, I was surprised I could run this (more or less) on my hardware.