Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

The Mass Shooter Ron Soye · Sep 21, 2023

https://arstechnica.com/information-technology/2023/09/openai-announces-dall-e-3-a-next-gen-ai-image-generator-based-on-chatgpt/

https://archive.ph/N9ul4

OpenAI’s new AI image generator pushes the limits in detail and prompt fidelity

With better response to details and text, DALL-E 3 hopes to make prompt engineering obsolete.

Looks like a major escalation.


An illustration of an avocado sitting in a therapist's chair, saying 'I just feel so empty inside' with a pit-sized hole in its center. The therapist, a spoon, scribbles notes.

A vast landscape made entirely of various meats spreads out before the viewer. tender, succulent hills of roast beef, chicken drumstick trees, bacon rivers, and ham boulders create a surreal, yet appetizing scene. the sky is adorned with pepperoni sun and salami clouds.

A minimap diorama of a cafe adorned with indoor plants. Wooden beams crisscross above, and a cold brew station stands out with tiny bottles and glasses.

Close-up photograph of a hermit crab nestled in wet sand, with sea foam nearby and the details of its shell and texture of the sand accentuated.

A paper craft art depicting a girl giving her cat a gentle hug. Both sit amidst potted plants, with the cat purring contentedly while the girl smiles. The scene is adorned with handcrafted paper flowers and leaves.

Pixel art scene of Coit Tower standing tall on Telegraph Hill, with a panoramic view of the city below and birds flying around.

Tiny potato kings wearing majestic crowns, sitting on thrones, overseeing their vast potato kingdom filled with potato subjects and potato castles.

An illustration of a human heart made of translucent glass, standing on a pedestal amidst a stormy sea. Rays of sunlight pierce the clouds, illuminating the heart, revealing a tiny universe within. The quote 'Find the universe within you' is etched in bold letters across the horizon.

A middle-aged woman of Asian descent, her dark hair streaked with silver, appears fractured and splintered, intricately embedded within a sea of broken porcelain. The porcelain glistens with splatter paint patterns in a harmonious blend of glossy and matte blues, greens, oranges, and reds, capturing her dance in a surreal juxtaposition of movement and stillness. Her skin tone, a light hue like the porcelain, adds an almost mystical quality to her form.

A 3D render of a coffee mug placed on a window sill during a stormy day. The storm outside the window is reflected in the coffee, with miniature lightning bolts and turbulent waves seen inside the mug. The room is dimly lit, adding to the dramatic atmosphere.

A comparison of "An expressive oil painting of a basketball player dunking, depicted as an explosion of a nebula" as generated by DALL-E 2 (left) and DALL-E 3 (right).

You can see some problems with DALL-E 3 following instructions in the meat landscape and potato king prompts.

Baraadmirer · Sep 21, 2023

The Mass Shooter Ron Soye said:
New audio AI dropped, infinite lo-fi hip hop AI hell is now very plausible:

https://stability.ai/research/stable-audio-efficient-timing-latent-diffusion

Check "Prompt: lofi hip hop beat melodic chillhop 85 BPM" (it's a .M4A)

https://stableaudio.com/

Trance, Ibiza, Beach, Sun, 4 AM, Progressive, Synthesizer, 909, Dramatic Chords, Choir, Euphoric, Nostalgic, Dynamic, Flowing
trance-ibiza-beach-sun-4-am-progressive-synthesizer-909-dramatic-chords-choir-euphoric-nostalg...mp3

Synthpop, Big Reverbed Synthesizer Pad Chords, Driving Gated Drum Machine, Atmospheric, Moody, Nostalgic, Cool, Club, Striped-back, Pop Instrumental, 100 BPM
synthpop-big-reverbed-synthesizer-pad-chords-driving-gated-drum-machine-atmospheric-moody-nost...mp3

https://arstechnica.com/information-technology/2023/09/ai-can-now-generate-cd-quality-music-from-text-and-its-only-getting-better/

https://archive.ph/r7gBf

Feels like it's got a ways to go with certain genres, but interesting to see where it goes. I tried

Code:

JRPG orchestral battle music, piano, energetic, trumpets, drums, triangle

and got this:

Mr.Miyagi · Sep 21, 2023

The Mass Shooter Ron Soye said:
https://arstechnica.com/information-technology/2023/09/openai-announces-dall-e-3-a-next-gen-ai-image-generator-based-on-chatgpt/ https://archive.ph/N9ul4

As a nod to these controversies, OpenAI says that DALL-E 3 is designed to decline requests that ask for an image in the style of a living artist. OpenAI also provides a form where creators can opt out of having their images used to train future models. It seems unlikely that these measures will satisfy artists who typically think AI training should be opt-in only rather than included in image data sets by default.

It'll be interesting to see how people work on creating possible workarounds/jailbreaks of this, if possible. I don't really believe OpenAI will be stripping out any part of the dataset that could help improve the model's efficacy, so these steps feel more like an attempt at legal CYA measures.

macrodegenerate · Sep 21, 2023

If DALL-E is closed source, there's no point. SDXL is amazing for LoRAs, and LoRAs sell Stable Diffusion.

The Mass Shooter Ron Soye · Sep 21, 2023

macrodegenerate said:
If DALL-E is closed source, there's no point. SDXL is amazing for LoRAs, and LoRAs sell Stable Diffusion.

If a normie can do more useful work with an OpenAI subscription than messing around with open source models, then there's definitely a point. It also gives an idea of where things are heading for competing models.

Roland TB-303 · Sep 21, 2023

Mr.Miyagi said:
It'll be interesting to see how people work on creating possible workarounds/jailbreaks of this, if possible. I don't really believe OpenAI will be stripping out any part of the dataset that could help improve the model's efficacy, so these steps feel more like an attempt at legal CYA measures.

OpenAI has proven again and again that they will neuter and ruin their own products for the sake of “safety”. The article even mentions they have more safeguards in place for DALL-E 3.

macrodegenerate · Sep 21, 2023

Homofascism said:
OpenAI has proven again and again that they will neuter and ruin their own products for the sake of “safety”. The article even mentions they have more safeguards in place for DALL-E 3.

And here's the thing, I think the vast amount of image generation is done by coomers, furries and people copying styles to resell. So mostly consumers.
DALL-E 3 isn't a viable product to maximize profits if it can't hit those market segments.
Plus those market segments provide massive leaps forward in the tech in the span of months or weeks. The efficiency gains in training time from Dreambooth to Hypernetworks to LoRAs are insane. That's just 6 months for completely different iterative approaches to improve on fine-tuning accuracy.

AmpleApricots · Sep 27, 2023

And yet another contender has arrived: Mistral (https://mistral.ai). Frenchmen and ex-Llama people released a 7B model. Notably, going by the benchmarks, it beats the 13B and code 34B Llama 2 models. The general consensus seems to be that it's smart for a 7B model. It was trained on a context size of 8k (vs. 4k for Llama 2), and the team has announced 13B and 34B versions for this fall. I feel this might be one to look out for.

There was also a Chinese model released, Qwen 14B. Nothing special really, except it really likes to play up Xi Jinping.

Irrational Exuberance · Sep 27, 2023

Baraadmirer said:
Feels like it's got a ways to go with certain genres, but interesting to see where it goes. I tried

Code:

JRPG orchestral battle music, piano, energetic, trumpets, drums, triangle

and got this:
jrpg-orchestral-battle-music-piano-energetic-trumpets-drums-triangle_092123.mp3

I could see that as the BGM of the realm of a mad god; the sky is a discordant mishmash of jagged colors which dart crazily from one end to the other, while abominations of nature scream and flail about - and it's impossible to tell whether they even intend to hurt you or not.

inception_state · Sep 27, 2023

Normie Twitter has discovered ControlNet and is having fun hiding politically incorrect messages in pastoral scenes. E.g., here are two examples I found:

https://twitter.com/mayfer/status/1706912930355392756

https://twitter.com/JobTheBaptist/status/1706119697756807189

And here's a sneed castle I made using the same method.

13065-2079610094-a castle on a hill with a river running through it and boats in the water bel...png

13066-2079610095-a castle on a hill with a river running through it and boats in the water bel...png

macrodegenerate said:
And here's the thing, I think the vast amount of image generation is done by coomers, furries and people copying styles to resell. So mostly consumers.

This is 100% accurate. Look at the most popular models and LoRAs on Civitai. It's a mix of NSFW, furry shit, and borderline copyright infringement (recognizable characters, people, etc). Also, making DALL-E 3 unable to mimic styles of particular artists is going to be a problem for anyone who wants to use assets in a game, book, or some other context where they need stylistic consistency.

The Mass Shooter Ron Soye · Sep 27, 2023

Irrational Exuberance said:
I could see that as the BGM of the realm of a mad god; the sky is a discordant mishmash of jagged colors which dart crazily from one end to the other, while abominations of nature scream and flail about - and it's impossible to tell whether they even intend to hurt you or not.

Stable Audio isn't perfect, but it's a lot better than that crap that appeared a few months ago. What was it called... I think it was Riffusion. It appears to be on a trajectory that could eventually take music generation to where image generation is now. And of course, lazy fucks can use GPT to create verbose text descriptions to be fed into the music generators.

macrodegenerate · Sep 28, 2023

inception_state said:
This is 100% accurate. Look at the most popular models and LoRAs on Civitai. It's a mix of NSFW, furry shit, and borderline copyright infringement (recognizable characters, people, etc).

Holy shit you were not kidding. I've never used Civitai because I don't release my models, but wow that's a lot of porn. Also found this:

verymuchawful · Sep 30, 2023

Messing with DALLE-3.

A cozy image of a Kiwi bird next to an upright kiwi fruit, digital photography, next to text that reads "TTD" with a heart next to the "D"

AmpleApricots · Sep 30, 2023

Are they still doing that thing where, when you write "people", "man" or "woman", it secretly inserts "black" into the prompt?

inception_state · Sep 30, 2023

verymuchawful said:
Messing with DALLE-3.

Its ability to insert human-readable text and understand prompts without "prompt engineering" seems pretty impressive. Have they even published a paper for DALLE-3 yet though? Seriously, fuck OpenAI.

Sulla · Sep 30, 2023

In theory, couldn't you just ask whatever text AI to convert your description to a prompt, making D-3 immediately obsolete?

inception_state · Sep 30, 2023

Sulla said:
In theory, couldn't you just ask whatever text AI to convert your description to a prompt, making D-3 immediately obsolete?

It's possible they are doing that, taking the user's prompt, applying a text->text model to turn the natural language prompt into a more machine interpretable prompt, and then using that as the input for their text->image model. Still though, that does not explain their ability to insert text directly into the image and compose multiple objects/concepts in such a reliable way.
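To make the speculated two-stage idea concrete, here is a rough sketch in Python. Nothing here reflects how DALL-E 3 actually works; `expand_prompt`, `fake_render`, and `two_stage_generate` are made-up stand-ins for a text->text model, a text->image model, and the glue between them.

```python
from typing import Callable


def expand_prompt(user_prompt: str) -> str:
    """Hypothetical stand-in for a text->text model that rewrites a casual
    request into a more detailed, machine-friendly prompt."""
    details = "highly detailed, coherent composition"
    # Drop surrounding whitespace and a trailing period before appending notes.
    base = user_prompt.strip().rstrip(".")
    return f"{base}. Style notes: {details}."


def fake_render(prompt: str) -> dict:
    """Hypothetical stand-in for a text->image model. It only records the
    prompt it was given; no image is produced."""
    return {"prompt_used": prompt, "image": None}


def two_stage_generate(
    user_prompt: str,
    rewrite: Callable[[str], str],
    render: Callable[[str], dict],
) -> dict:
    """Stage 1 rewrites the prompt, stage 2 renders from the rewritten prompt."""
    detailed = rewrite(user_prompt)
    return render(detailed)


result = two_stage_generate(
    "a kiwi bird next to a kiwi fruit.", expand_prompt, fake_render
)
print(result["prompt_used"])
```

Swapping `expand_prompt` for a call to any chat model and `fake_render` for any open image model would give the "just ask a text AI first" setup, which is why the rewriting step alone would not account for the text rendering and composition.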

The Mass Shooter Ron Soye · Sep 30, 2023

Sulla said:
In theory, couldn't you just ask whatever text AI to convert your description to a prompt, making D-3 immediately obsolete?

inception_state said:
It's possible they are doing that, taking the user's prompt, applying a text->text model to turn the natural language prompt into a more machine interpretable prompt, and then using that as the input for their text->image model. Still though, that does not explain their ability to insert text directly into the image and compose multiple objects/concepts in such a reliable way.

Not sure what's happening internally in DALL-E 3, but they say it's "built on ChatGPT" and they will be encouraging the use of ChatGPT to make prompts:

DALL·E 3 is built natively on ChatGPT, which lets you use ChatGPT as a brainstorming partner and refiner of your prompts. Just ask ChatGPT what you want to see in anything from a simple sentence to a detailed paragraph.

When prompted with an idea, ChatGPT will automatically generate tailored, detailed prompts for DALL·E 3 that bring your idea to life. If you like a particular image, but it’s not quite right, you can ask ChatGPT to make tweaks with just a few words.

DALL·E 3 will be available to ChatGPT Plus and Enterprise customers in early October.

Colon capital V · Oct 1, 2023

I know it's a humorous scene to behold, but when I first saw this I was legitimately taken aback and realized we pretty much are on the cusp of AI capable of generating anything with convincing results. Fucking wild it's managed to get a grasp on this sorta "low-poly, in-game, analog TV set" look.

Neo Kiwi · Oct 1, 2023

Chinese Obama
