Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

The base models of SD have never been super great at producing aesthetically-pleasing genitalia.
They are not super great in general. I have a hard time believing that, under the hood, DALL-E 3 and other commercial models don't use multiple fine-tuned models, chosen based on the nature of your prompt. Remember that Stability AI is not "giving away" everything. They are still a business and they make money off premium services, but the parts they do provide are of course crucial for developing anything at all. This is where user-created models, LoRAs, workflows, and other bits come in, which you absolutely should use depending on what you want to create. Even the base Stable Diffusion XL model sucks hard, despite being a massive improvement over its predecessor. You download DreamShaper XL and suddenly you can create amazing stylised portraits and landscapes.
 
how the hell do you use novelai's control tools
i added a drawing to the ui and clicked palette swap then enabled "add more detail", but everything i get is blown-out and shit
 
Playing around with Stable Cascade some more. Square images are 1024x1024. Other images are 1024x2048 or 2048x1024. No upscaling.

1708849414327.png
evocative photograph of a statue by arno breker, the statue depicts a man, he is staring into a german landscape

1708850352452.png
1970s cartoon, conan the barbarian, posing with a sword aimed towards sky, lightning strikes the sword, dramatic composition

1708851004628.png
sturmtiger, modern photo
(obviously not a Sturmtiger, but cool nonetheless)

1708851153642.png
nurse, soviet propaganda poster

1708851725882.png
sci-fi spaceship, europan landscape, total shot
(It probably thought "Europan" was a typo of "European")

1708851882172.png
sci-fi spaceship, titan landscape, total shot

1708852093667.png
rifle on a table, ammunition, grenades, rifle cartridges, close-up shot
(Cascade is dramatically more capable of depicting guns than XL, even if pretty wonky. On XL you would get an unrecognisable pile of pipes.)

1708852458525.png
evocative impressionist depiction of the crucifixion of jesus christ, dramatic composition

For a base untweaked model this is extremely impressive, in my opinion.
 
I wonder what happens if you search for "japanese village" in Google's woke AI. Does it give you African slums instead, or does it only count if you want to generate people? What about objects and animals? Does it give you black cats if you want white cats? How far does that woke madness go?
 
If it's anything like the ChatGPT preprompt someone extracted, it's an instruction to construct the prompt so that "diverse" adjectives are randomly sprinkled in whenever people are mentioned. So it should only affect humans (the ice cream picture is a joke btw). The thing is, telling an LLM to do something won't make it hold back when doing so is inappropriate or ahistorical, even if you try to write that caveat into the instruction. That's entirely something the user should curate.
But fuck users, right. As someone who works in ML this is an industry-wide issue of hyperventilating but basically undefined "bias ethics" they're teaching in schools while not devoting any time to the higher-order implications of trying to fix vaguely assumed dataset skews with a ham fist. Like effectively deleting real women from history by inserting fake bitches in places they shouldn't be for centuries yet.

anyway,
 
It should give you Neo Yokio (tomblerone kino)

Anyways, the real test is the "holding a sign with the words of the prompt" trick that lets you reveal all the added shit in Bing.
 
That stuff is great, are you running it locally?

Also, RunwayML is very fun for animation. I've been considering paying them for that and the 3D scanning tool but don't like spending money.
 
This isn't remotely what I asked for but I still like it.


mario1.jpg

edit: I am now convinced that Bing's shitty ancient smooth-brain AI doesn't have a god damn clue who Edouard Fournier is, or his work "The Funeral of Shelley". Oh well, I present more random attempts anyway. Fucking robot, probably did even worse in Art History than I did, if such a thing is even possible...

mario2.jpg mario3.jpg mario4.jpg mario5.jpg
I do like Mario in that last one quite a bit though, might crop it to be my next avatar when I tire of the horny anime kid.

edit2: This dumb fucking thing really thinks I wanted an internet browser instead of Bowser? Why would Mario, a Witchfinder General, burn a browser at the stake? That makes no sense.
mario7.png
 
In February 2024, OpenAI, which we prefer to call it "CloseAI", has recently unveiled Sora, a groundbreaking text-to-video model that represents a significant leap in video generation technology. Sora has the capability to transform short text descriptions into detailed, high-definition film clips that can last up to one minute. This model advances AI technology and offers a new level of creative potential in video production.

Today, we are thrilled to launch a project called Open-Sora plan, aiming to reproduce OpenAI's video generation model, and have received unanimous anticipation from netizens both domestically and internationally.
Chinese researchers aim to replicate OpenAI's Sora text-to-video capabilities... and they're calling it Open-Sora. There are some examples on the GitHub page and they aren't good. I wish them luck.
 