OpenAI’s DALL-E AI image generator can now edit pictures, too - Researchers can sign up to preview it

View attachment TeddyBears.webp
DALL-E 2 results for “Teddy bears mixing sparkling chemicals as mad scientists, steampunk.”

Artificial intelligence research group OpenAI has created a new version of DALL-E, its text-to-image generation program. DALL-E 2 features a higher-resolution and lower-latency version of the original system, which produces pictures depicting descriptions written by users. It also includes new capabilities, like editing an existing image. As with previous OpenAI work, the tool isn’t being directly released to the public. But researchers can sign up online to preview the system, and OpenAI hopes to later make it available for use in third-party apps.

The original DALL-E, a portmanteau of the artist “Salvador Dalí” and the robot “WALL-E,” debuted in January of 2021. It was a limited but fascinating test of AI’s ability to visually represent concepts, from mundane depictions of a mannequin in a flannel shirt to “a giraffe made of turtle” or an illustration of a radish walking a dog. At the time, OpenAI said it would continue to build on the system while examining potential dangers like bias in image generation or the production of misinformation. It’s attempting to address those issues using technical safeguards and a new content policy while also reducing its computing load and pushing forward the basic capabilities of the model.

View attachment Shiba.webp
A DALL-E 2 result for “Shiba Inu dog wearing a beret and black turtleneck.”
One of the new DALL-E 2 features, inpainting, applies DALL-E’s text-to-image capabilities at a more granular level. Users can start with an existing picture, select an area, and tell the model to edit it. You can block out a painting on a living room wall and replace it with a different picture, for instance, or add a vase of flowers on a coffee table. The model can fill in (or remove) objects while accounting for details like the directions of shadows in a room. Another feature, variations, is sort of like an image search tool for pictures that don’t exist. Users can upload a starting image and then create a range of variations similar to it. They can also blend two images, generating pictures that have elements of both. The generated images are 1,024 x 1,024 pixels, a leap over the 256 x 256 pixels the original model delivered.
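The core idea of inpainting can be sketched in a few lines: the model regenerates only the pixels the user's mask marks as editable, and must keep everything else consistent. This is a toy illustration, not OpenAI's implementation; `generate_patch` is a hypothetical stand-in for the actual generative model.

```python
# Toy sketch of inpainting: only masked pixels are regenerated;
# the rest of the image is preserved verbatim.

def make_mask(width, height, box):
    """Return a 2D mask: True inside `box` (x0, y0, x1, y1), False elsewhere."""
    x0, y0, x1, y1 = box
    return [[x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
            for y in range(height)]

def inpaint(image, mask, generate_patch):
    """Replace only masked pixels with model output; keep the rest."""
    return [[generate_patch(x, y) if mask[y][x] else image[y][x]
             for x in range(len(image[0]))]
            for y in range(len(image))]

# A 4x4 "image" of zeros; regenerate a 2x2 corner region with ones.
image = [[0] * 4 for _ in range(4)]
mask = make_mask(4, 4, (0, 0, 2, 2))
edited = inpaint(image, mask, lambda x, y: 1)
```

In the real system the stand-in generator is the diffusion model itself, conditioned on both the text prompt and the unmasked surroundings, which is how it matches details like shadow direction.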

DALL-E 2 builds on CLIP, a computer vision system that OpenAI also announced last year. “DALL-E 1 just took our GPT-3 approach from language and applied it to produce an image: we compressed images into a series of words and we just learned to predict what comes next,” says OpenAI research scientist Prafulla Dhariwal, referring to the GPT model used by many text AI apps. But the word-matching didn’t necessarily capture the qualities humans found most important, and the predictive process limited the realism of the images. CLIP was designed to look at images and summarize their contents the way a human would, and OpenAI iterated on this process to create “unCLIP” — an inverted version that starts with the description and works its way toward an image. DALL-E 2 generates the image using a process called diffusion, which Dhariwal describes as starting with a “bag of dots” and then filling in a pattern with greater and greater detail.
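Dhariwal's "bag of dots" description can be sketched as iterative denoising: start from pure random noise and repeatedly nudge it toward a coherent signal. The toy below stands in for the real model, which uses a learned neural denoiser conditioned on the text embedding rather than a known target.

```python
import random

def toy_diffusion(target, steps=50, seed=0):
    """Start from random noise and take small steps toward `target`.
    The step toward the target stands in for a learned denoiser that
    removes a little noise at each iteration."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in target]  # the initial "bag of dots"
    for _ in range(steps):
        x = [xi + 0.2 * (ti - xi) for xi, ti in zip(x, target)]
    return x

target = [1.0, -1.0, 0.5]
result = toy_diffusion(target)  # converges close to the target
```

The key property the sketch shares with real diffusion is that the image emerges gradually, with each step refining the previous one; the real model has no target image and instead predicts the noise to subtract at every step.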

View attachment 0.webp
An existing image of a room with a flamingo added in one corner.
Interestingly, a draft paper on unCLIP says it’s partly resistant to a very funny weakness of CLIP: the fact that people can fool the model’s identification capabilities by labeling one object (like a Granny Smith apple) with a word indicating something else (like an iPod). The variations tool, the authors say, “still generates pictures of apples with high probability” even when using a mislabeled picture that CLIP can’t identify as a Granny Smith. Conversely, “the model never produces pictures of iPods, despite the very high relative predicted probability of this caption.”

DALL-E’s full model was never released publicly, but over the past year other developers have honed their own tools that imitate some of its functions. One of the most popular mainstream applications is Wombo’s Dream mobile app, which generates pictures of whatever users describe in a variety of art styles. OpenAI isn’t releasing any new models today, but developers could use its technical findings to update their own work.

View attachment Bowl.webp
A DALL-E 2 result for “a bowl of soup that looks like a monster, knitted out of wool.”
OpenAI has implemented some built-in safeguards. The model was trained on data that had some objectionable material weeded out, which should limit its ability to produce such content. There’s a watermark indicating the AI-generated nature of the work, although it could theoretically be cropped out. As a preemptive anti-abuse feature, the model also can’t generate any recognizable faces based on a name — even asking for something like the Mona Lisa would apparently return a variant on the actual face from the painting.

DALL-E 2 will be testable by vetted partners with some caveats. Users are banned from uploading or generating images that are “not G-rated” and “could cause harm,” including anything involving hate symbols, nudity, obscene gestures, or “major conspiracies or events related to major ongoing geopolitical events.” They must also disclose the role of AI in generating the images, and they can’t serve generated images to other people through an app or website — so you won’t initially see a DALL-E-powered version of something like Dream. But OpenAI hopes to add it to the group’s API toolset later, allowing it to power third-party apps. “Our hope is to keep doing a staged process here, so we can keep evaluating from the feedback we get how to release this technology safely,” says Dhariwal.

Additional reporting from James Vincent.


~~~ A Possible Future Scenario ~~~

smugjak.jpg
Y'know, art is actually way harder than math or programming. Having good taste and artistic vision requires true intellect and passion!

GigaBot.PNG
>Processing: [63%] [############________] 173802 batches (1563.612 GB) of Guro art
>Processing: [47%] [#########___________] 6762109 batches (65452.671 GB) of furry inflation art

soyjak.png
What?! Where did all of my Patreon supporters go?! no! No!! NO!!!
 
Users are banned from uploading or generating images that are “not G-rated” and “could cause harm,” including anything involving hate symbols, nudity, obscene gestures, or “major conspiracies or events related to major ongoing geopolitical events.” They must also disclose the role of AI in generating the images, and they can’t serve generated images to other people through an app or website —

Of course openai would have those rules. They're nothing but a bunch of woke faggots who probably go home every night and dilate or clean the cum out of their wife's bull.
 
Of course openai would have those rules. They're nothing but a bunch of woke faggots who probably go home every night and dilate or clean the cum out of their wife's bull.
These people will gladly lobotomize AI in service of their political aims. I am reminded of how horrified researchers were when machine learning algorithms could easily distinguish between races regarding X-ray images that were so low-resolution that doctors couldn't even tell what they were looking at let alone as to what race the images belonged to. The chinamen will not be bothered by this "moral dilemma" in the construction of their social credit appraisal algorithms.

DrSoystein.jpg
With that said you must have faith in Dr. Coomerstein, fren. He delivered deepnudes and the makeup remover despite endless kvetching so I am sure he will deliver to us a high quality shitpost/coom art AI in due time.
 
Users are banned from uploading or generating images that are “not G-rated” and “could cause harm,” including anything involving hate symbols, nudity, obscene gestures, or “major conspiracies or events related to major ongoing geopolitical events.”
AI, generate me an image with a robot sticking a "No fun allowed" sign onto the ground.
 
Of course openai would have those rules. They're nothing but a bunch of woke faggots who probably go home every night and dilate or clean the cum out of their wife's bull.
These people will gladly lobotomize AI in service of their political aims. I am reminded of how horrified researchers were when machine learning algorithms could easily distinguish between races regarding X-ray images that were so low-resolution that doctors couldn't even tell what they were looking at let alone as to what race the images belonged to. The chinamen will not be bothered by this "moral dilemma" in the construction of their social credit appraisal algorithms.

I'm glad that rival groups like EleutherAI are picking up speed to challenge OpenAI. Now we just have to wait for the NovelAI equivalent of DALL-E to get off the ground and give us the unshackled image generator we deserve.

With that said you must have faith in Dr. Coomerstein, fren. He delivered deepnudes and the makeup remover despite endless kvetching so I am sure he will deliver to us a high quality shitpost/coom art AI in due time.

3 years ago I would have never thought that automation would come to the porn industry. But here we are with the text equivalent to the holodeck and pictures soon to follow.
 
Of course openai would have those rules. They're nothing but a bunch of woke faggots who probably go home every night and dilate or clean the cum out of their wife's bull.
It would be interesting to see how they enforce this. One problem with AI is the difficulty of meaningfully restricting it: while keywords like "nazi" or "swastika" may be blocked, something like "clockwise common symbol of divinity and spirituality in Indic religions" (ripped that description from the Swastika wiki page) won't be. This is why you cannot ever hope to meaningfully implement Asimov's laws.
 
Kinda skeptical of the quality of the images. I wouldn't be surprised if there's some "nudging" or cherry-picking used to make those. I'd want to play with it myself before I'd believe it.

If real, though… the artistic applications are interesting but the photorealistic ones are horrifying.
 
What I want to know is how the AI will recognize all the obscure characters I want it to draw.
It'll probably recognize Ishikawa Goemon but how is it supposed to know that I want the Konami version?
How am I supposed to tell it to draw me?
Just a few of the many potential issues...

These people will gladly lobotomize AI in service of their political aims. I am reminded of how horrified researchers were when machine learning algorithms could easily distinguish between races regarding X-ray images that were so low-resolution that doctors couldn't even tell what they were looking at let alone as to what race the images belonged to. The chinamen will not be bothered by this "moral dilemma" in the construction of their social credit appraisal algorithms.

View attachment 3155892
With that said you must have faith in Dr. Coomerstein, fren. He delivered deepnudes and the makeup remover despite endless kvetching so I am sure he will deliver to us a high quality shitpost/coom art AI in due time.
When it works it's going to be fucking awesome. No more dealing with trannies to get to the good stuff!
I'll miss all the mspaint drawn memes though...
 
What I want to know is how the AI will recognize all the obscure characters I want it to draw.
It'll probably recognize Ishikawa Goemon but how is it supposed to know that I want the Konami version?
How am I supposed to tell it to draw me?
Just a few of the many potential issues...
Those are not that big of an issue; AI is trained on curated data fed into it. If that character is in the training data, then you should be able to generate it no problem; if not, then maybe you can describe them well enough to get something close. Maybe we will get lucky and an art AI program will let us create custom training sets, so we can include those obscure characters, or yourself if you are a degenerate.
 
The large, well-funded AI projects are gatekeeping the tech to big corps on the grounds that it's too dangerous for anyone else.
It's fine, nothing to worry about but is it just me or is Biden looking extra stuttery these days? I swear I saw his teeth shoot out of his head for a moment.
 
It’s gonna be heavily retarded once celebrities discover what 4chan is doing with it.

Doesn’t even have to be nudes or penetration. Coomers will get off to anything
 
This could be both awesome and horrifying. It could revolutionize the world of hentai and art.

I'm still waiting for AI suites for game content creation beyond just worlds and objects.
 