Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

You use a checkpoint (YiffyMix) based on a distilled model of SDXL (Lightning). Pony XL might actually improve your art if you give it a shot. The LoRA diversity on civit.ai is impressive.
Right now I have a setup where I can scrape images off e621 and generate captioned .txt files for them to train LoRAs with, plus an auto-tagger for e621 tags. I also have software installed that can auto-caption images I feed it that don't have to come from e621 at all. That's good for me because certain stuff I like (not cub, that's all I care to say without hijacking this thread) isn't found in the e621 database.
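For anyone curious, the caption-file step is the simple part. A minimal Python sketch of what mine boils down to; `tag_image` here is a stand-in for whatever auto-tagger you actually have installed (WD14, an e621 tagger, etc.), not a real library call:

```python
from pathlib import Path

def write_captions(image_dir: str, tag_image) -> None:
    """Write a sidecar .txt caption next to every image, the layout LoRA trainers expect.

    `tag_image` stands in for whatever auto-tagger is installed; it should take
    an image path and return a list of tag strings.
    """
    for img in sorted(Path(image_dir).iterdir()):
        if img.suffix.lower() not in {".png", ".jpg", ".jpeg", ".webp"}:
            continue
        tags = tag_image(img)                        # e.g. ["solo", "outdoors", ...]
        img.with_suffix(".txt").write_text(", ".join(tags), encoding="utf-8")
```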

So, theoretically I have everything I need to make LoRAs; I simply don't have the experience to know which settings need to be adjusted, and how, to get things perfect.

So I have what I need, and I have enough familiarity with the technology to walk people through installation; the only thing I'm missing is the precision for perfect final outputs. In general you want the capacity to make LoRAs for each new model that comes out. Pony is indeed powerful, but it's unwieldy. I don't believe that's a contentious opinion.
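If it helps anyone else in the same spot, the usual starting settings people throw at kohya's sd-scripts look roughly like this. Every number below is a common community starting point rather than something I can vouch for, since dialing these in per dataset is exactly the part I haven't mastered yet:

```python
# Common starting points, expressed as kohya sd-scripts (train_network.py /
# sdxl_train_network.py) argument names. Values are community defaults, not a
# recipe -- tune per dataset.
lora_settings = {
    "network_module": "networks.lora",
    "network_dim": 32,            # LoRA rank: more capacity, bigger file
    "network_alpha": 16,          # often dim/2 or equal to dim
    "learning_rate": 1e-4,        # UNet LR; drop it if outputs look fried
    "text_encoder_lr": 5e-5,      # usually lower than the UNet LR
    "resolution": "1024,1024",    # match the base model (SDXL-family here)
    "train_batch_size": 2,        # whatever fits in VRAM
    "max_train_steps": 2000,      # ~100 steps per training image is a rough rule of thumb
    "optimizer_type": "AdamW8bit",
    "mixed_precision": "bf16",
    "caption_extension": ".txt",  # the sidecar captions from the tagger
}
```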

Moreover, there's a certain lack of transparency from the Pony team that leads me to believe it's not a good option. I've spoken with a developer of the ChromaXL model (which suits my tastes well), and he let slip that he and his team have ideas for the direction they want to take their model that I believe even the Kiwifarms community would support, in spite of it being a furry model.
 
Saw L1 promoting an Nvidia card for AI... 8GB, for almost $400... in 2024.

Better get a 4060Ti with 16GB for $50 more.
It really is tragic what they've done with PCIe lanes on consumer boards. You used to be able to run at least two cards at x8 via two x16 slots. Alas, we are all stuck with only one GPU forever now. At the 4060 Ti's price point you could have two 3060s running 24GB of VRAM total.

Edit: Just checked the latest chipsets. You can run two GPUs on a lot of them. Buy two 3060s if you want to sacrifice speed for VRAM. You are fucked on LLMs though, due to the 3060's garbage memory bandwidth.
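If anyone actually tries the dual-3060 route for LLMs, llama.cpp will split a model across both cards. A rough sketch with the llama-cpp-python bindings (needs a CUDA-enabled build; the model path and split ratios are placeholders):

```python
from llama_cpp import Llama

# Spread a quantized GGUF model roughly evenly across two 12GB cards.
# The path and ratios are placeholders; n_ctx matters too, since the
# KV cache also lives in VRAM.
llm = Llama(
    model_path="models/some-model-Q4_K_M.gguf",  # placeholder
    n_gpu_layers=-1,           # offload every layer to the GPUs
    tensor_split=[0.5, 0.5],   # fraction of the model on GPU 0 / GPU 1
    n_ctx=8192,
)
out = llm("Q: Why buy two 3060s?\nA:", max_tokens=64)
print(out["choices"][0]["text"])
```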
>check link
>literally all gay furry shit
Scrolling down, the Bara Garfield was kind of a shock
 
Just checked the latest chipsets. You can run two GPUs on a lot of them. Buy two 3060s if you want to sacrifice speed for VRAM. You are fucked on LLMs though, due to the 3060's garbage memory bandwidth.
If all I wanted was to fap to AI gay crap like homo donkey over there, sure, but right now I'm more interested in custom LLM models, so the 4060 would be better. I heard there was going to be a 24GB 4070 but I can't find it anywhere.
 
If all I wanted was to fap to AI gay crap like homo donkey over there, sure, but right now I'm more interested in custom LLM models, so the 4060 would be better. I heard there was going to be a 24GB 4070 but I can't find it anywhere.
Just buy a used 3090 at that point. They run around $750 on eBay and give you the VRAM you want.
 
It will when those tariffs hit. You think the price for a 4090 is bad now?
Meh, assembling a card is nothing, and if the tariffs are really bad (very unlikely, it's not 2016 anymore) it can be done here. The real problem is the chips, especially the GPUs, and those come from Taiwan, but I doubt orange man will put tariffs on those, given it would fuck up every industry over here; you can't just lift TSMC off the ground and place it here like a bicycle factory.
 
Isn't the 5090 around the corner? Paying $750 for a GPU two gens behind doesn't seem like good business. How's the Radeon compatibility going? Any improvements?
The 4070 is 3 times faster than the 3090 for AI; however, even the 4070 Ti Super doesn't have 24GB of VRAM, only the 4090 does. If you know your model is smaller, something like Llama 3.1 8B or maybe Nemo 12B (pushing it at 12GB), you may be better off with a 4070. The issue is that if you can't fit your model, you can't run it. Inb4 llama.cpp RAM offloading, but god damn is that slow.

I want to run Mistral small/Flux, and I need 24GB of VRAM to do that. You might not.
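For the "can I fit it" question, the napkin math is just parameter count times bits per weight. A quick sketch; it ignores KV cache and runtime overhead, and the quant bit-widths below are approximate, so treat the results as lower bounds:

```python
def weight_vram_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate VRAM for the weights alone (no KV cache, no overhead)."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1024**3

# Parameter counts are the published model sizes; bit-widths are rough.
print(weight_vram_gb(8, 16))    # Llama 3.1 8B in fp16        -> ~14.9 GB
print(weight_vram_gb(12, 5.5))  # Nemo 12B at a ~Q5 quant     -> ~7.7 GB
print(weight_vram_gb(22, 5.5))  # Mistral Small 22B at ~Q5    -> ~14.1 GB
```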
 
Does anyone know how to train your own checkpoint on an Apple Silicon Mac? I’ve tried LoRAs but they don’t really work for what I want. Maybe I’m just not training or tagging them right.
 
Does anyone know how to train your own checkpoint on an Apple Silicon Mac? I’ve tried LoRAs but they don’t really work for what I want. Maybe I’m just not training or tagging them right.
LoRAs are realistically the best you can do on an Apple Silicon Mac. How are you tagging them? What model are you training on?
 
I want to run Mistral small/Flux, and I need 24GB of VRAM to do that.
24GB for Mistral Small? fuuuuuck, anyway isn't 12GB enough for schnell? I've seen people running it on that much RAM.

And yeah, there's nothing below the xx90 tier with 24GB. The 4060 Ti comes with 16GB, though it's GDDR6; the 4070 has GDDR6X. Don't know how much of a difference it makes.
 
24GB for Mistral Small? fuuuuuck, anyway isn't 12GB enough for schnell? I've seen people running it on that much RAM.

And yeah, there's nothing below the xx90 tier with 24GB. The 4060 Ti comes with 16GB, though it's GDDR6; the 4070 has GDDR6X. Don't know how much of a difference it makes.
The bigger issue, imo, is context. With just 24GB your context will be tiny, think 2-5k tokens. The 128k of GPT-4o it is not, and even that isn’t quite enough for my D&D sessions.
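How much VRAM the context itself eats is easy to estimate once you know the model's layer and head counts: the KV cache is two tensors (K and V) per layer, each n_kv_heads × head_dim per token. Quick sketch; the numbers plugged in below are illustrative for a modern GQA model, not any specific release, so pull the real values from the model's config:

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 ctx_len: int, bytes_per_elem: int = 2) -> float:
    """KV cache size: 2 tensors (K and V) per layer, fp16/bf16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem / 1024**3

# Illustrative GQA-style figures (56 layers, 8 KV heads, head_dim 128):
print(kv_cache_gib(56, 8, 128, 8192))    # ~1.75 GiB at 8k context
print(kv_cache_gib(56, 8, 128, 65536))   # ~14 GiB at 64k context
# A model without GQA (n_kv_heads == n_heads) needs several times more.
```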
 
24GB for Mistral Small? fuuuuuck, anyway isn't 12GB enough for schnell? I've seen people running it on that much RAM.

And yeah, there's nothing below the xx90 tier with 24GB. The 4060 Ti comes with 16GB, though it's GDDR6; the 4070 has GDDR6X. Don't know how much of a difference it makes.
Schnell is hot garbage, and you need to offload/reload the text encoder if you don't have enough VRAM.

GDDR6X is a big enough difference to go to the 4070. That's at least a 30% memory bandwidth increase.
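For the offload/reload trick outside of ComfyUI: diffusers can do the shuffling for you with model-level CPU offload, which keeps the T5 text encoder and the rest of the pipeline in system RAM until each piece is needed. A minimal sketch, assuming you've accepted the FLUX.1-dev license on Hugging Face and have a recent diffusers build:

```python
import torch
from diffusers import FluxPipeline

# Load FLUX.1-dev in bf16 and let diffusers move each sub-model (text encoders,
# transformer, VAE) between system RAM and the GPU as it's needed, instead of
# keeping the whole pipeline resident in VRAM.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # or enable_sequential_cpu_offload() if VRAM is really tight

image = pipe(
    "a red fox reading a newspaper in a diner booth",
    num_inference_steps=28,   # dev wants real step counts, unlike schnell
    guidance_scale=3.5,
).images[0]
image.save("flux_test.png")
```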
 
and even that isn’t quite enough for my D&D sessions.
That's what you're using it for?
Schnell is hot garbage, and you need to offload/reload the text encoder if you don't have enough VRAM.
I haven't been keeping up with FLUX. Wasn't schnell the only version that could be run locally, with the rest you had to pay BFL for? Did they make the other tiers available for local use? And I've seen some good stuff with schnell, though when I tested it, it took many tries to get a good result with legible text.
GDDR6X is a big enough difference to go to the 4070. That's at least a 30% memory bandwidth increase.
That explains the price. Then again, like I said five pages ago, I still don't get why, with the entire glut of used 3xxx cards after the crypto bust, the Chinese aren't making 18-24GB (or more) versions of those. A YouTuber already showed it's possible, it just takes labor, and the Chinese can do it since they've been upgrading shitty 580 Radeons from 2 to 8GB for a while, a 4x increase that's pointless because the GPU is the bottleneck there. But my point is they've got the economies of scale to do it, and yet they aren't, even though there's real demand, not just for AI but from gamers, because 8-12GB is now too little.
 
The 4070 is 3 times faster than the 3090 for AI; however, even the 4070 Ti Super doesn't have 24GB of VRAM, only the 4090 does. If you know your model is smaller, something like Llama 3.1 8B or maybe Nemo 12B (pushing it at 12GB), you may be better off with a 4070. The issue is that if you can't fit your model, you can't run it. Inb4 llama.cpp RAM offloading, but god damn is that slow.

I want to run Mistral small/Flux, and I need 24GB of VRAM to do that. You might not.
That explains the price. Then again, like I said five pages ago, I still don't get why, with the entire glut of used 3xxx cards after the crypto bust

3090s have NVLink, and two together get you 48GB of VRAM for your LLM. Not possible with the 4090 or 5090 or anything consumer going forward.
 
I've been running Flux on 8GB cards with ComfyUI... I guess much of the work is deferred to system RAM, and it's slow at about 120 seconds/image, but it does work. 64GB system RAM here, though one of my machines has 32GB and does fine too. Usually I just set a batch and do something else for a while.

30 steps, flux dev, 8GB 3070. 1088x1440 "Prompt executed in 131.99 seconds"
[Attached image: SNEED_3611.jpg]
 