Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

I'm glad I've finally figured out how to get it working on my setup, but I'm real wary of updating from Mint 21.3 to 22. It took me a couple weeks of on-and-off troubleshooting before I figured out what was going wrong, and even then I had to track down what was causing random crashes in the Automatic1111 UI before I found this launch script after digging through the ROCm GitHub issues page. I'm glad it's working, and I can generate 2048x2048 and bigger images with Tiled VAE, but I want to make sure it never breaks again so I don't have to deal with it. My fault for going AMD, but I was using the card mostly for gaming before I got back into SD, and the 7800 XT is a good card for that, especially on Linux.
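For anyone else on RDNA3 hitting the same crashes: the load-bearing part of these launch scripts is usually the HSA override, since ROCm doesn't officially support gfx1101 (the 7800 XT). A minimal sketch of the same idea, assuming a ROCm build of PyTorch:

```python
import os

# Pretend gfx1101 (RX 7800 XT) is the officially supported gfx1100.
# Must be set before torch initializes the ROCm runtime.
os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "11.0.0")

import torch

# ROCm builds of PyTorch report through the torch.cuda API
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))
```

If that prints True and your card's name, the override took; the equivalent in a shell launch script is just exporting the variable before starting the webui.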
Mint 22 ships with Python 3.12, which SD.Next would not start with, and I was too stupid to figure out how to get a parallel install of 3.11 working. Very frustrating. What are your iteration times like with 2048-pixel images?
 
  • Feels
Reactions: Stalphos Johnson
Mint 22 ships with Python 3.12, which SD.Next would not start with, and I was too stupid to figure out how to get a parallel install of 3.11 working. Very frustrating. What are your iteration times like with 2048-pixel images?
Yeah, I was worried about something like that. I thought SD.Next used a venv, so that shouldn't be a problem, but who knows? I use Tiled VAE to cheat; I can't really generate 2048x2048 images without it. Iteration times are a little hard to pin down because Tiled VAE goes through several stages to reduce VRAM usage. Usually I get around 0.2-0.3 it/s with a tile height and width of 128, but some stages run faster or slower. Depending on what you're making, you may need to increase the tile size, and that can drop it to 0.05-0.1 it/s.
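If you ever end up on plain diffusers instead of the UI extension, its built-in VAE tiling pulls the same trick. A rough sketch, with SDXL base standing in for whatever checkpoint you actually use:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
# Decode/encode the latent in overlapping tiles so a 2048x2048 image
# doesn't OOM the VAE pass. Tile size is diffusers' default here, not
# the 128 mentioned above.
pipe.enable_vae_tiling()
pipe.enable_vae_slicing()  # also decode one image at a time when batching

image = pipe("a harpy perched on a ruined tower", height=2048, width=2048).images[0]
image.save("out_2048.png")
```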
 
Last edited:
  • Informative
Mint 22 ships with Python 3.12, which SD.Next would not start with, and I was too stupid to figure out how to get a parallel install of 3.11 working. Very frustrating. What are your iteration times like with 2048-pixel images?
That hit a bunch of Linux distros recently, as they all moved to 3.12. It's generally pretty easy to fix: the distro will usually have 3.11/3.10 as installable packages, you can use the deadsnakes PPA (for Ubuntu-based distros), or you can build Python yourself from source (it's less bad than it sounds, basically 5-10 minutes of following the instructions on GitHub).

You might have to recreate your venv, which is annoying, but it's a fairly minor inconvenience; see the sketch below.
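A hedged sketch of the recreation step, assuming python3.11 is already installed (on Mint/Ubuntu: `sudo add-apt-repository ppa:deadsnakes/ppa && sudo apt install python3.11 python3.11-venv`):

```python
import shutil
import subprocess
from pathlib import Path

venv_dir = Path("venv")  # the default venv location for A1111 / SD.Next
if venv_dir.exists():
    shutil.rmtree(venv_dir)  # the old 3.12 venv can't be retargeted in place

# Create a fresh venv pinned to the parallel 3.11 interpreter
subprocess.run(["python3.11", "-m", "venv", str(venv_dir)], check=True)
# Relaunch webui.sh afterwards; it reinstalls its dependencies into the new venv.
```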
 
If I hear one more retard claim that image diffusion models are "dangerous" I swear to god. If people look at the image below and think for a single fucking second that it could be real, that demonstrates a much worse problem: society at large is fucking retarded. Stupid meme images aren't what people should worry about. It's the stuff you see every day that you actually can't tell is generative AI or not that's actually scary.
[attached image 6317271]
Is this video real?

 
News hams (breaking the news)
News hams (off the clock)
[attached images: 1.jpg through 9.jpg]
 
What were both the LLM summarized and original prompts for the harpy?
I wish I remembered; I basically copy-pasted between two windows. The one thing I distinctly remember is that the LLM rewrites were sometimes quite flowery and verbose in that typical GPT style they all have, but this didn't seem to hurt Flux's interpretation at all, quite the contrary. The other quoted post sheds some light on why.

All of it can be summarized as: it's a transformer that was trained on GPT-4 captions to a much, much greater extent than SD3.
Yes, and that is really noticeable. Thank you for the detailed writeup; it was interesting. Don't excuse being knowledgeable and interested in a subject as autism. Knowing things is a good thing.

--
I still haven't really played around with Flux dev, but I intend to next week.
 
  • Informative
Reactions: Vecr
Someone has set up a microsite generating images with Flux. No account needed or anything for the moment.
I like comparing fastflux outputs to the demo up on Hugging Face. You're limited to a few images per hour, but fastflux seems to have been tweaked a fair bit: its results are more in line with "what people want," while the Black Forest demo does 'better' work for autistic detail, vector art, etc. As far as I can tell from my ham-fisted prompting.

A ~6GB VRAM fork is available now too for running FLUX locally, if building a GPU data center with someone else's money is out of your budget.
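If the fork doesn't work out, stock diffusers can also squeeze FLUX into a few GB by streaming weights through the GPU; it's slow, but it runs. A hedged sketch (FLUX.1-dev is gated on Hugging Face, so this assumes you've accepted the license and logged in):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
# Stream weights layer by layer through the GPU: much slower than keeping
# the model resident, but fits in a few GB of VRAM instead of ~24GB.
pipe.enable_sequential_cpu_offload()

image = pipe(
    "a harpy perched on a ruined tower",
    height=1024,
    width=1024,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("flux_out.png")
```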
 
Last edited by a moderator:
https://x.com/_akhaliq/status/1828631472632172911

Google presents "Diffusion Models Are Real-Time Game Engines"

discuss: https://huggingface.co/papers/2408.14837

We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality. GameNGen can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU. Next frame prediction achieves a PSNR of 29.4, comparable to lossy JPEG compression. Human raters are only slightly better than random chance at distinguishing short clips of the game from clips of the simulation. GameNGen is trained in two phases: (1) an RL-agent learns to play the game and the training sessions are recorded, and (2) a diffusion model is trained to produce the next frame, conditioned on the sequence of past frames and actions. Conditioning augmentations enable stable auto-regressive generation over long trajectories.
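The interesting bit is the autoregressive loop: once trained, the diffusion model replaces the renderer entirely. A toy illustration of the loop the abstract describes, where `model` and `policy` are hypothetical stand-ins rather than anything from the paper:

```python
from collections import deque

CONTEXT = 64  # hypothetical context length; the paper conditions on recent history

def generated_playthrough(model, policy, first_frame, steps=1000):
    frames = deque([first_frame], maxlen=CONTEXT)
    actions = deque(maxlen=CONTEXT)
    for _ in range(steps):
        # Phase 1's RL agent (or a human player) supplies the next action
        actions.append(policy(frames[-1]))
        # The next frame is generated, not rendered: the diffusion model
        # denoises a new frame conditioned on past frames and actions.
        frames.append(model.sample(list(frames), list(actions)))
    return frames[-1]
```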

 
Last edited:
I have some sketches I'm drawing for an upcoming project, schematics etc. What's the best option for feeding sketches into a model and having it spit out variations of a final product based on the sketch? I tried ChatGPT, but it seems pretty far behind as far as image generation goes. From some searching, ControlNet's scribble model looks like the usual route; the sketch below is roughly what I'm planning to try.
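A hedged sketch of that approach, with the common public model names standing in (any SD 1.5 checkpoint should work):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The scribble ControlNet wants white lines on a black background
sketch = load_image("my_schematic.png")

# Same sketch, several candidate final products
for i in range(4):
    image = pipe(
        "clean product render of the device in the sketch, studio lighting",
        image=sketch,
        num_inference_steps=25,
    ).images[0]
    image.save(f"variation_{i}.png")
```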
 