Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

I wouldn't go that far. For faking content, deepfaking over a video game render produces much more plausible results, and that's been doable for years now. Pay attention to the scale of the people. The woman walking before the pair the camera follows is like half their height, and at one point they pass a grown woman smaller than the left person (who for a few seconds seems to become a child himself).
There are tells. Some things feel weird, like in the beginning I was expecting the lady on the right to bump into the red uuuuh tarp I think that was? But I'm still worried about the future. It's getting easier to fake reality.
Taylor Swift won't be happy with this, I can tell you that much....
Neither should any woman. We're all in danger of being pornified now. Not me bc I'm not a public figure, but any woman whose face is out there may bump into porn of herself that she never made.
 
What I find most interesting about Cascade vs. diffusion is that it seems to have a clearer and more "compartmentalized" understanding of concepts. With Stable Diffusion it's often very difficult, if not impossible, to combine concepts that don't have much to do with each other, for example putting an object (e.g. a fantasy orc) into another setting (e.g. a space station), probably because there aren't a lot of example pictures of fantasy orcs in space stations in the training material. Cascade struggles a lot less with this; here, have a polaroid photo of a dunmer woman at the park:
dunmer.jpg

And yes, it understood the term "dunmer". While I'm sure there are cosplay pictures in the dataset, there can't possibly be that many. It's quite impressive really. We can even push it a little further:
argonian.jpg

They must've trained it on a well-sorted dataset. I'm sure Cascade and whatever builds on it will be a winner.
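
For anyone who wants to mess with it themselves, here's roughly how I'd reproduce the dunmer picture with the diffusers Stable Cascade pipelines. Cascade is two-stage: a prior turns the prompt into image embeddings, and a decoder turns those into the actual picture. This is just a sketch going off the diffusers docs (model IDs, dtypes, and step counts may need adjusting), so double-check before running:

Python:
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

# Stage C ("prior"): compresses the text prompt into image embeddings.
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=torch.bfloat16
).to("cuda")

# Stages B/A ("decoder"): expands those embeddings into the final image.
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.float16
).to("cuda")

prompt = "polaroid photo of a dunmer woman at the park"

prior_out = prior(
    prompt=prompt,
    height=1024,
    width=1024,
    guidance_scale=4.0,
    num_inference_steps=20,
)

image = decoder(
    image_embeddings=prior_out.image_embeddings.to(torch.float16),
    prompt=prompt,
    guidance_scale=0.0,
    num_inference_steps=10,
).images[0]

image.save("dunmer.png")

Swap the prompt for the argonian one and it's the same deal.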
 
examples - Sora
1708040772297.png
1708041490851.png





 
The current model has weaknesses.
Ya think? It still sucks at scale and geometry, and especially doesn't do stairs well.

The people walking along the snowy Japanese sidewalk end up walking past a tiny kiosk full of tiny humans. The man's walking speed is normal while his companion looks sped up.

The guy reading a book in the clouds has the top-right corner of the pages blow upward from the bottom-left corner. Plus he's reading and he's black, so that's already unrealistic.

In the Lagos, Nigeria clip, as the camera pans around, it treats the people and the clothing rack next to the man on the rooftop as if they were down on the ground, so they're all tiny compared to him.

Possibly the worst offender is the Amalfi Coast video, where the people walking toward the stairs suddenly turn toward the rail and get sucked into a black hole as they touch it. Another set of stairs below narrows to a point and doesn't actually go anywhere, and two other stairways further down don't even make geometrical sense.

In the video with the cat, the lady's right arm is appropriately stretched next to her across the bed, but somehow the hand sticking out of the sheets is also a right hand, and when she starts to turn over, what is clearly moving as her left shoulder suddenly becomes a blanket.

There's a lady under the Chinese New Year dragon clearly holding up the pole used to make the dragon dance, but it isn't even connected to the dragon.

Horribly deformed human legs in the "robot on the sidewalk" clip.

What looks like the nose of a space shuttle suddenly rolls into frame on a roof in the clip of the old man thinking about the universe.

I'm amazed that with all the recent advancement, it's still making the same mistakes.
 
examples - Sora
Should have posted the one with the cat and a woman in bed. It was interesting to see it perfectly show the cat hitting her nose, but also transform her left arm into the blanket she was under.
 
Should have posted the one with the cat and a woman in bed. It was interesting to see it perfectly show the cat hitting her nose, but also transform her left arm into the blanket she was under.
That's the best one; made me laugh.


How many legs do cats have again?
 
You people act as if this has absolutely no capability for misuse because you went in knowing what to expect and looked for mistakes. There are people out there who read on Twitter that the earth is flat, and some guy with enough likes saying so was enough for them to believe it. (And I had to pick this example of a conspiracy theory carefully as one of the dumbest, because this forum has quite its own base of rabid believers in various conspiracies/propaganda and I don't wanna start anything - it's that bad already.) This technology has absolutely destructive and very manipulative potential, and the examples you saw weren't made to sell you a specific narrative. Also remember we're most likely still in the C64 age of this particular technological revolution. A lot more to come.

(I actually am old as shit and remember the home computer revolution, and it had a ton of people who were RABIDLY against computers and FIRMLY believed they were a fad that would never be good for anything. They weren't even completely wrong; for the first few years they really weren't all that useful. Still, look where we are now.)
 
That's the best one; made me laugh.

View attachment 5725832
How many legs do cats have again?
I try not to nitpick these things too hard since it's incredibly impressive what it's capable of right now, but the mistakes it makes are really funny (like how the girl in this starts off as a mushmouthed downie, turns into a normal person, and then back into a mushmouthed downie when her jaw makes contact with the pillow) and sometimes surreal like this one where they magic a chair into existence. It's really neat and fun to pick apart. You have these moments where this could feasibly be mistaken for real footage of a piece of furniture being dug up and then suddenly the non-euclidean plastic chair spawns in starts moving on its own, it's really goofy.

My runner-up for my favorite is the one where the AI interprets an exploding basketball as a basketball that briefly ignites and then reproduces via budding.
 

Attachments

  • chair-archaeology.mp4
    6.8 MB
I'm amazed that with all the recent advancement, it's still making the same mistakes.
They are using basically the same techniques that made vague blurry smudges back in 2021, they're just scaling it up to extreme levels now, applying neat tricks to make it easier to train and run, and optimizing it so much that it can be run on home GPUs. Same foundation, same fuckups. But now it's in the time dimension too.
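
To make that concrete: the sampling loop itself doesn't change at all when you go from images to video, the latent just grows a frames axis. Toy sketch, obviously not OpenAI's actual code (Sora is reportedly a transformer over spacetime patches), but it's why the same spatial screwups carry straight over:

Python:
import torch

def toy_denoiser(x, t):
    # Stand-in for the trained model: just nudges the noise toward "cleaner".
    return x * 0.98

def sample(denoiser, shape, steps=50):
    # The generic diffusion sampling loop: start from pure noise, denoise step by step.
    x = torch.randn(shape)
    for t in reversed(range(steps)):
        x = denoiser(x, t)
    return x

# Image model: latent shaped (batch, channels, height, width).
image = sample(toy_denoiser, (1, 4, 64, 64))

# Video model: exact same loop, the latent just gains a frames axis,
# (batch, channels, frames, height, width) -- so every spatial mistake
# the image models make comes along for the ride, now smeared across time.
video = sample(toy_denoiser, (1, 4, 16, 64, 64))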
 
OpenAI introduced Sora ("Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions") today on their Twitter thread - archive. You can visit it here.
View attachment 5725089
Here is an example. The prompt is: "Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes."

View attachment 5725088
Imagine being a retard like keffals and paying for stock footage of mushrooms for your shitty horror channel and then this shit rolls out literally a week later. How embarrassing!
 

1) Can I run this locally
2) How much VRAM and processing power is needed to run it compared to SDXL and SD1.5
3) Is it the same level of open to training to generate anything as SDXL and SD1.5
1. Yes:
1708046693160.png
2.
1708046788811.png

1708047100120.png
(from YouTube)

3.
Yes, they made some of their own training scripts.

1708046999397.png
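
On 2., if the card can't hold everything at once, the usual diffusers offloading knobs apply to Cascade as well. Rough sketch under the same assumptions as the snippet earlier in the thread (model IDs and arguments from the diffusers docs, so verify before running); it trades speed for peak VRAM:

Python:
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=torch.bfloat16
)
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.float16
)

# Instead of .to("cuda"): keep the weights in system RAM and move each
# sub-model to the GPU only while it's actually running. Slower, but peak
# VRAM drops a lot. enable_sequential_cpu_offload() squeezes even further
# at the cost of being much slower still.
prior.enable_model_cpu_offload()
decoder.enable_model_cpu_offload()

prompt = "a fantasy orc repairing a console on a space station"

embeddings = prior(prompt=prompt, guidance_scale=4.0, num_inference_steps=20).image_embeddings
image = decoder(
    image_embeddings=embeddings.to(torch.float16),
    prompt=prompt,
    guidance_scale=0.0,
    num_inference_steps=10,
).images[0]
image.save("orc_station.png")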
 
I tend to believe that in reality it will still have artifacts, even in future models.
I would like to run something like this but my PC specs suck.

In the worst case, where it becomes impossible to detect artifacts, audio-visual content will not be allowed in courts and we will go back 100 years, like @9gfuwegw9j9 said.
Either that, or there is going to be an embedded cryptographic fingerprint on every single camera, and they'll have to upload unedited data to a tightly controlled server or something. Or maybe they'll make all the AI shit embed something so it can be detected. I do recall there was a thing with upscaling the images during the Rittenhouse trial, so I really think if it's pervasive they'd have to put out a standard for it.
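For what it's worth, the "cryptographic fingerprint" idea isn't exotic: the camera would sign the untouched sensor data with a key baked into the hardware, and anyone could later verify the signature against the manufacturer's public key; AI-generated or edited footage simply wouldn't verify. Toy sketch with a made-up software key, real provenance schemes are more involved:

Python:
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# In a real camera this key would live in a secure chip from the factory;
# here we just generate one for the sketch.
device_key = Ed25519PrivateKey.generate()
device_pubkey = device_key.public_key()  # what the manufacturer would publish

# "Capture": the camera signs the raw frame bytes as it writes them out.
raw_frame = b"raw sensor data straight off the chip"
signature = device_key.sign(raw_frame)

def looks_authentic(frame_bytes: bytes, sig: bytes) -> bool:
    # Anyone (a court, a platform) can check the bytes against the public key.
    try:
        device_pubkey.verify(sig, frame_bytes)
        return True
    except InvalidSignature:
        return False

print(looks_authentic(raw_frame, signature))                           # True
print(looks_authentic(raw_frame + b" plus some AI edits", signature))  # False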
I try not to nitpick these things too hard since it's incredibly impressive what it's capable of right now, but the mistakes it makes are really funny (like how the girl in this starts off as a mushmouthed downie, turns into a normal person, and then back into a mushmouthed downie when her jaw makes contact with the pillow) and sometimes surreal like this one where they magic a chair into existence. It's really neat and fun to pick apart. You have these moments where this could feasibly be mistaken for real footage of a piece of furniture being dug up and then suddenly the non-euclidean plastic chair spawns in starts moving on its own, it's really goofy.
View attachment 5725905
My runner-up for my favorite is the one where the AI interprets an exploding basketball as a basketball that briefly ignites and then reproduces via budding.
Pretty surreal to watch. So much looks OK but so much looks so wrong on that chair video.
 