Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

inception_state · Aug 13, 2024

Overly Serious said:
@verymuchawful Well you were right about what was possible. I was surprised at how able to run Flux locally I was.

It's a bit slow but doable. I'm getting around 1 iteration/sec with Flux Dev, fp16, on a 4090. It's using about 22 GB of VRAM though.

The Tall Man said:
Can someone who played around with Flux tell me if it can copy artists style and does it know specific people?
Could it create a drawing of George Floyd punching Elizabeth Olsen in the stomach in the style of Todd McFarlane?

It seems like we will need LoRAs for specific people and styles. It's awful at celebrities, likely intentionally. It's okay at generic styles, but not specific artist styles. Eg. putting in your prompt it returned:

The composition and relation between objects is a huge improvement though. It's much closer to the Microsoft image generator in terms of being able to accurately depict how things in the image relate to each other. Once we get some fine-tunes and LoRAs it's going to be incredible.

Eg.

draw a cartoon of a clown riding on a unicycle, juggling a torch, a sword, and a bowling ball.

First try and it's dead-on.

whatever I feel like · Aug 13, 2024

inception_state said:
It's a bit slow but doable. I'm getting around 1 iteration/sec with Flux Dev, fp16, on a 4090. It's using about 22 GB of VRAM though.

It seems like we will need LoRAs for specific people and styles. It's awful at celebrities, likely intentionally. It's okay at generic styles, but not specific artist styles. Eg. putting in your prompt it returned:
View attachment 6304070

The composition and relation between objects is a huge improvement though. It's much closer to the Microsoft image generator in terms of being able to accurately depict how things in the image relate to each other. Once we get some fine-tunes and LoRAs it's going to be incredible.

Eg.

View attachment 6304060
First try and it's dead-on.

Mine gave me the same Bob Vila guy for Floyd, I wonder if something weird was in the images it trained on.

Maybe saying Family Guy style would have worked better than Seth McFarlane style.

Edit: I am now realizing that Todd is a very different McFarlane, so generic comicbook style is acceptable, if not perfect.

DeadwastePrime · Aug 13, 2024

inception_state said:
Eg.

First try and it's dead-on.

if we're being technical, the clown's legs arent peddling the thing

Train Operator · Aug 14, 2024

whatever I feel like said:
it can't draw an accurate Jar Jar Binks to save its life.

that's why pony is still king

whatever I feel like · Aug 14, 2024

Train Operator said:
that's why pony is still king

View attachment 6308557

Now have him lovingly embracing Shrek.

The Mass Shooter Ron Soye · Aug 14, 2024

Artists claim “big” win in copyright suit fighting AI image generators (archive)

In an order on Monday, US district judge William Orrick denied key parts of motions to dismiss from Stability AI, Midjourney, Runway AI, and DeviantArt. The court will now allow artists to proceed with discovery on claims that AI image generators relying on Stable Diffusion violate both the Copyright Act and the Lanham Act, which protects artists from commercial misuse of their names and unique styles.

Catgirl Tyranid · Aug 14, 2024

The Mass Shooter Ron Soye said:
Artists claim “big” win in copyright suit fighting AI image generators (archive)

Kill copyright. Behead copyright. Roundhouse kick copyright into the concrete. Slam dunk copyright lovers into the trashcan. Crucify the filthy Lanham Act. Defecate in copyright's food. Launch copyright into the sun. Stir fry copyright in a wok. Toss copyright into active volcanoes. Urinate into a copyright lover's gas tank. Judo throw copyright into a wood chipper. Twist copyright'ss heads off. Report copyright's to the IRS, wait not that one. Karate chop copyright in half. Curb stomp the Copyright Act. Trap copyright lovers in quicksand. Crush copyright in the trash compactor. Liquefy copyright in a vat of acid. Eat copyright. Dissect copyright. Exterminate copyright in the gas chamber. Stomp copyright skulls with steel toed boots. Cremate copyright in the oven. Lobotomize copyright. Mandatory abolition of copyright. Grind copyright in the garbage disposal. Drown copyright in fried chicken grease. Vaporize copyright with a ray gun. Kick old copyright down the stairs. Feed copyright to alligators. Slice copyright with a katana.

inception_state · Aug 15, 2024

All images are made for the purpose of parody, etc, etc. People are already training real person LoRAs for Flux and they are pretty good.

Susanna · Aug 15, 2024

The Mass Shooter Ron Soye said:
Artists claim “big” win in copyright suit fighting AI image generators (archive)

Learn to mine coal.

The Mass Shooter Ron Soye · Aug 15, 2024

Musk’s new Grok upgrade allows X users to create largely uncensored AI images (archive)

Susanna said:
Learn to mine coal.

Learn to polish the knobs of coal-mining robots.

whatever I feel like · Aug 15, 2024

The Mass Shooter Ron Soye said:
Musk’s new Grok upgrade allows X users to create largely uncensored AI images (archive)

View attachment 6310177 View attachment 6310178

Learn to polish the knobs of coal-mining robots.

Oh no, how awful, now people will confuse that AI generated fake news image of Donald Trump surfing on an airplane with cat girls with the real thing. We must protect Our Democracy and ban this.

Post Reply · Aug 15, 2024

https://fastflux.ai/

https://www.reddit.com/r/StableDiffusion/comments/1esxa5y/generating_flux_images_in_near_realtime/

Someone has set up a microsite generating images with Flux. No account needed or anything for the moment.

Synthetic Smug · Aug 15, 2024

The Mass Shooter Ron Soye said:
Musk’s new Grok upgrade allows X users to create largely uncensored AI images (archive)

View attachment 6310177 View attachment 6310178

Learn to polish the knobs of coal-mining robots.

Can confirm Grok is pretty good, just as long as you don't ask for anything too exacting or technical. Code assist also seems not terrible.

whatever I feel like · Aug 15, 2024

Post Reply said:
https://fastflux.ai/

https://www.reddit.com/r/StableDiffusion/comments/1esxa5y/generating_flux_images_in_near_realtime/
Someone has set up a microsite generating images with Flux. No account needed or anything for the moment.
View attachment 6311196

I still can't get any Binks images. This thing has effectively erased Jar Jar Binks from history. I'm so pissed.

Catgirl Tyranid · Aug 16, 2024

whatever I feel like said:
I still can't get any Binks images. This thing has effectively erased Jar Jar Binks from history. I'm so pissed.

Now get rid of Shrek.

AmpleApricots · Aug 16, 2024

Ok I played around a bit with the fast version someone here posted a website for and what directly really jumps out at you is how complex your text prompt can be. You can just heap on detail (within reason) and it will try to do it.

It grasps abstract stuff really really well, you can build all sorts of weird things and they will remain fairly consistent, really showing off it's good conceptual understanding. What I found amazing is that you just kinda describe something to it and that way, build it. This creature is not a thing (I am aware of) I "invented" it purely by describing it. It even transfers seamlessly into other styles.

It's also a good wallpaper generator. Watermark inclusive.

Everything that is just remotely cartoon style tends to slip into big breasted anime girls really quickly. A negative prompt probably would help a lot here.

Another example of mixing of things that usually don't belong together and clean conceptual understanding making it possible. It needed a bit of wrangling to not turn the statue into a normal woman, but it worked really well. Verbose prompts are the key.

I'm very impressed, looking forward to having time to play with the dev version. This really lifted imagen to another level. It's funny considering that artists are still malding about SD and are probably not very aware that this exists.

verymuchawful · Aug 17, 2024

If I hear one more retard claim that image diffusion models are "dangerous" I swear to god. If people look at the image below and think for a single fucking second that it could be real, that demonstrates a much worse problem with society at large being fucking retarded. Fucking stupid meme images aren't the thing people should worry about. It's what you see everyday that you actually can't determine as being generative AI or not that's actually scary.

Blade of Grass · Aug 17, 2024

verymuchawful said:
If I hear one more retard claim that image diffusion models are "dangerous" I swear to god. If people look at the image below and think for a single fucking second that it could be real, that demonstrates a much worse problem with society at large being fucking retarded. Fucking stupid meme images aren't the thing people should worry about. It's what you see everyday that you actually can't determine as being generative AI or not that's actually scary.
View attachment 6317271

Do you think those are believable? I've played with using fastflux and putting "shaky, blurry photo" into the prompt. Do you think one could start hoaxes with these?

donald_trump_in_the_motion_of_sitting_down_into_a_black_car_near_white_house__shot_from_behind...jpg

missile_shot_down_by_aa_above_white_house__shaky__blurry_photo_of_the_sky.jpg

small_explosion_in_the_middle_of_the_sky_very_high_above_white_house__shaky__blurry_photo_of_t...jpg

AmpleApricots · Aug 17, 2024

So what I noticed with Flux is that simple language seems to be actually somewhat determinal in an indirect way, if you repeat yourself a lot (which will increase the length of the instruction) this will lead to confusion which expresses itself very similar how it expresses itself in "normal" LLMs (ignoring parts of the instruction), also distorted, "confused" images like in SD with repetition. I would not discourage repetition altogether though, sometimes it does help to "drive a point home". Generally it seems to be better to restructure the prompt, though.

An interesting thing to attempt is to let an AI rewrite the prompt. I discounted automatic prompting with SD largely because in my experimenting, it simply did not lead to good results if you didn't feed the AI all the right keywords, and at that point, you might as well write the prompt yourself. It seems to work well with Flux though. If you consider that the images were probably categorized by AI in a conversational way as theorized in this thread earlier, it makes sense that another AI would find the "right language" (perhaps GPTisms?) to get exactly what you asked for.

Prompting Flux is very different from prompting SD and all the other models that came out (and perhaps MJ and Dall-E, never used those) and for optimal results instead of looking for the right keywords (which often, simply do not exist), it makes much more sense to just describe what you want. I know this has already been said on this very page, just for the sake of completeness.

For example:

(video game adverts soon be like)

This is a comic style artwork of a generic fantasy harpy. When using the word "Harpy", "D&D Harpy" etc. flux usually generated a harpy eagle or some confused mashup of a bird and a woman, a woman holding a bird etc.. sometimes some anime abomination which seems to be the Flux fallback way (SDs fallback way was 00s style 3D renders). D&D style humanoid monsters are sort of my personal benchmark because most models really struggle with them, especially if they are half- something. So I just described the character, down to having a four-fingered claw like hand. The word harpy or even bird was not used once. After writing my verbose prompt I had it summarized by an LLM and it seemed to bring the perplexity down, I felt there were a lot less "failed" generations than with my handwritten prompt. Might be placebo, might be my ESL - I did not test it nearly long enough - but interesting if true.

(broken off arm prompt! worked only in like 1-2 of 10 times though)

I think this is what the kids call having an aesthetic. (I love that this thing can actually do dark scenes) in stable diffusion models, some things are *always* lighted from the same direction. This leads to some scenes looking so unnatural. If you see it once, you cannot unsee it.

This concludes my post. Thanks for reading my blog!

Vecr · Aug 17, 2024

AmpleApricots said:
The word harpy or even bird was not used once. After writing my verbose prompt I had it summarized by an LLM and it seemed to bring the perplexity down, I felt there were a lot less "failed" generations than with my handwritten prompt. Might be placebo, might be my ESL - I did not test it nearly long enough - but interesting if true.

What were both the LLM summarized and original prompts for the harpy?

Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

inception_state

whatever I feel like

Mushroom Kingdom Uber Alles!

DeadwastePrime

pronouns in bio

Train Operator

conductor on the glow rail

whatever I feel like

Mushroom Kingdom Uber Alles!

The Mass Shooter Ron Soye

You CAN'T NOT DO IT!

Catgirl Tyranid

inception_state

Susanna

Ruin is inevitable, and all else is prelude

The Mass Shooter Ron Soye

You CAN'T NOT DO IT!

whatever I feel like

Mushroom Kingdom Uber Alles!

Post Reply

Don't leave that comment in draft

Synthetic Smug

whatever I feel like

Mushroom Kingdom Uber Alles!

Catgirl Tyranid

AmpleApricots

verymuchawful

Enjoy prison, sticker child.

Blade of Grass

AmpleApricots

Vecr

DM if I don't respond.