Stable Diffusion, NovelAI, Machine Learning Art - AI art generation discussion and image dump

I have some sketches I'm drawing for an upcoming project, schematics etc. What's the best option for feeding sketches into a model and having it spit out variations of a final product based on the sketch? I tried ChatGPT but it seems pretty far behind as far as image generation goes.
Are you trying to train a particular subject or a concept?
 
I have some sketches I'm drawing for an upcoming project, schematics etc. What's the best option for feeding sketches into a model and having it spit out variations of a final product based on the sketch? I tried ChatGPT but it seems pretty far behind as far as image generation goes.
Sounds like you want controlnet:
controlnet_sketch.png
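For anyone trying this with pencil-on-paper scans: as far as I know, the scribble-style ControlNet models are usually fed white strokes on a black background, so a dark-on-light sketch generally needs thresholding and inverting first. A minimal sketch in plain Python, assuming the scan is already loaded as rows of 0-255 grayscale values. The function name and the cutoff of 128 are made up for illustration, and most UIs have an invert preprocessor that does the same job.

```python
def sketch_to_control_map(gray_rows, threshold=128):
    """Turn a dark-on-light sketch into a white-on-black control map.

    gray_rows: list of rows, each a list of 0-255 grayscale ints.
    threshold: hypothetical cutoff; pixels darker than this count as strokes.
    """
    return [[255 if px < threshold else 0 for px in row] for row in gray_rows]


# Tiny 3x4 "scan": one dark vertical stroke on light paper.
scan = [
    [250, 30, 245, 240],
    [248, 25, 250, 251],
    [252, 40, 249, 247],
]
control = sketch_to_control_map(scan)
```

Here `control` ends up with 255 in the stroke column and 0 everywhere else, which is the kind of map you would hand to the ControlNet input.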
 
Are you trying to train a particular subject or a concept?
I just want it to do the thing. I don't have a bunch of data to train it on; I just want to use a model to get more ideas from my sketches. The ChatGPT model will only capture a few details of the sketch and hallucinate the rest. I think it does image -> text description -> image instead of image -> image.
That looks right, I'll try it. Thank you.
 
You should be able to make your own LoRA for Stable Diffusion or Flux. I haven't done it myself, but there are guides like this that might be of use.
slightly off topic but i can never tell what accent this guy has. I've always assumed he's a pejeet just because why wouldn't he be, but the more i listen the less convinced i am.
 
Nonprofit scrubs illegal content from controversial AI training dataset (archive)
Releasing Re-LAION 5B: transparent iteration on LAION-5B with additional safety fixes (archive)
In all, 2236 links were removed after matching with the lists of link and image hashes provided by our partners. These links also subsume 1008 links found by the Stanford Internet Observatory report in Dec 2023. Note: A substantial fraction of these links known to IWF and C3P are most likely dead (as organizations make continual efforts to take the known material down from public web), therefore this number is an upper bound for links leading to potential CSAM.
A couple thousand potential images (that may have been unavailable when models were trained on this dataset) does not really change anything, but it's worth noting that it's back.
 
Nonprofit scrubs illegal content from controversial AI training dataset (archive)
Releasing Re-LAION 5B: transparent iteration on LAION-5B with additional safety fixes (archive)

A couple thousand potential images (that may have been unavailable when models were trained on this dataset) does not really change anything, but it's worth noting that it's back.
Make sure to save what you want, while you can

Edit: OK I'm guilty of not reading the article... But point still stands regarding models in general imo
 
Nonprofit scrubs illegal content from controversial AI training dataset (archive)
Releasing Re-LAION 5B: transparent iteration on LAION-5B with additional safety fixes (archive)

A couple thousand potential images (that may have been unavailable when models were trained on this dataset) does not really change anything, but it's worth noting that it's back.
Fun fact, all of the tools used to actually scan for and detect child pornography are paywalled even though the training data used to develop them are provided free to their producers by tax funded law enforcement agencies. The only exception is Project Arachnid which is closed source. There is nothing stopping the federal government from developing a list of fuzzy hashes and allowing the general public free access. The price of compute is a drop in the bucket for the FBI's budget. It's just more profitable for nonprofits if the distribution of CSAM is easier.
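For anyone unfamiliar with what a public list of fuzzy hashes would look like in use, here is a toy average-hash sketch. This is not PhotoDNA or the algorithm of any real scanning tool, just a stand-in to show the idea: near-identical images get near-identical hashes, and matching is a Hamming-distance check against a list. All names and the distance threshold are hypothetical.

```python
def average_hash(gray_rows):
    """Toy perceptual hash: one bit per pixel, set if brighter than the mean.

    The usual aHash recipe downsizes the image to 8x8 first; here the input
    is assumed to already be a small grayscale grid of 0-255 ints.
    """
    pixels = [px for row in gray_rows for px in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for px in pixels:
        bits = (bits << 1) | (1 if px > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def matches_blocklist(image_hash, blocklist, max_distance=2):
    """True if the hash is within max_distance bits of any listed hash."""
    return any(hamming(image_hash, h) <= max_distance for h in blocklist)


original = [[10, 200, 10, 200], [200, 10, 200, 10],
            [10, 200, 10, 200], [200, 10, 200, 10]]
# Same picture slightly brightened, as a re-encode might do.
# A cryptographic hash of the file would change completely.
reencoded = [[px + 5 for px in row] for row in original]
unrelated = [[10, 10, 200, 200], [10, 10, 200, 200],
             [200, 200, 10, 10], [200, 200, 10, 10]]

blocklist = [average_hash(original)]
```

With this setup, `matches_blocklist(average_hash(reencoded), blocklist)` is true and the same check on `unrelated` is false. The point is that the list holds only hashes, never the images themselves.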
 
Fun fact, all of the tools used to actually scan for and detect child pornography are paywalled even though the training data used to develop them are provided free to their producers by tax funded law enforcement agencies. The only exception is Project Arachnid which is closed source. There is nothing stopping the federal government from developing a list of fuzzy hashes and allowing the general public free access. The price of compute is a drop in the bucket for the FBI's budget. It's just more profitable for nonprofits if the distribution of CSAM is easier.
We'll see a shift to AI-based detection replacing or complementing hashes, to target any newly created material that hasn't been cataloged and hashed. Open source decentralized networks could use client-side filters to detect and hide potentially illegal content. Platforms like Facebook are already using AI to detect "misinformation" and "hate speech", and can find nudity and copyrighted music in live streams. They are surely working on detecting CSAM, mass shootings, etc. in real time. Acting on it automatically without causing headaches requires getting the false positives down.
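To put rough numbers on the false-positive point: when the thing being detected is rare, even a tiny false-positive rate swamps the real hits. The rates below are invented purely for illustration and are not measurements of any real classifier.

```python
def flagged_breakdown(total_items, prevalence, true_positive_rate, false_positive_rate):
    """Return (true_hits, false_alarms, precision) for a detector run over total_items."""
    bad = total_items * prevalence
    good = total_items - bad
    true_hits = bad * true_positive_rate
    false_alarms = good * false_positive_rate
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision


# Hypothetical: 1 in 100,000 uploads is actually bad, the detector catches
# 99% of those, and it wrongly flags 0.1% of everything else.
hits, alarms, precision = flagged_breakdown(
    total_items=10_000_000,
    prevalence=1e-5,
    true_positive_rate=0.99,
    false_positive_rate=0.001,
)
```

With these made-up numbers, about 99 real hits get buried under roughly 10,000 false alarms, so fewer than 1 in 100 flags is correct. That is why automatic action is a headache until the false-positive rate drops by orders of magnitude.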

Without even training with CSAM, you could probably combine simple body detection (rectangle around the whole person) with these for a half-assed solution:
These are just commercial services I searched up in seconds. There may be open alternatives.

Back to LAION, it was much ado about nothing since the up to 2,236 images trained on have little effect on big image models, and the virtual CSAM is being done with fine-tuning techniques. But LAION-5B was taken down in December and is finally back as Re-LAION 5B.
 
Does anyone know of, or can anyone suggest, a way of removing overlaid text from short videos? Case in point: the video attached to this post. The universal tendency to stick flashing transcription over what someone is saying is, to me, staggeringly annoying. For a start, I don't need it. But worse, it distracts me from actually seeing people's facial expressions or other detail. All I'm doing is reading flashing text. It's to the point that if I see a video has this, I just don't watch it at all now.

The ideal would be to magically erase the text by filling in an approximation of what is behind it either from its surroundings or from when the text is briefly not displayed. The second best would be to stick an actual black bar over where the text is for the entire video. That would be ugly but still make the video watchable to me.

Bonus if there were a way to eliminate random zooms in and out. Again, the attached sample video is a good demonstration of this. I don't mean when it goes back and forth between two people to one, I mean when it's focused on one person but it still keeps jumping in and out.

I truly hate this and not only aesthetically. I believe it affects the processing of the information you receive and the amount of it.

Would love to develop some plugin that could just read in a video like this and output the fixed version.
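A rough sketch of the two options described above, on toy data. It assumes the frames are already decoded into rows of grayscale values and that the caption region is known in advance; the function names and the `(top, left, bottom, right)` region format are made up for illustration. The median trick only helps when the background behind the text is roughly static and the text covers a given pixel in fewer than half the frames, so a moving talking head would need a proper video inpainting model instead.

```python
from statistics import median


def black_bar(frames, region):
    """Second-best option: blank the caption region in every frame."""
    top, left, bottom, right = region
    out = []
    for frame in frames:
        new = [row[:] for row in frame]  # copy so the input is untouched
        for y in range(top, bottom):
            for x in range(left, right):
                new[y][x] = 0
        out.append(new)
    return out


def temporal_median_fill(frames, region):
    """Replace the caption region with each pixel's median value over time."""
    top, left, bottom, right = region
    out = [[row[:] for row in frame] for frame in frames]
    for y in range(top, bottom):
        for x in range(left, right):
            background = median(frame[y][x] for frame in frames)
            for new in out:
                new[y][x] = background
    return out


def make_frame(with_text):
    """2x3 toy frame: flat grey, with one bright 'text' pixel when captioned."""
    frame = [[100, 100, 100], [100, 100, 100]]
    if with_text:
        frame[1][1] = 255
    return frame


# Text flashes on in 2 of 5 frames, inside the bottom row.
frames = [make_frame(t) for t in (False, True, False, True, False)]
caption_region = (1, 0, 2, 3)  # top, left, bottom, right (exclusive)

cleaned = temporal_median_fill(frames, caption_region)
barred = black_bar(frames, caption_region)
```

For the plain black bar on a real file, ffmpeg's `drawbox` filter can draw a filled box without any custom code. Decoding and re-encoding real video is left out here.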
 
AI-based detection
It will never not be amusing to me how sci-fi kept telling us for almost a century that computer intelligence would, if capable of human language at all, be very literal and autistic and have trouble understanding subtext, hidden meaning and nuance, while in reality some LLMs are more capable of it than some people I know. The potential for completely controlling the information flow on the internet is enormous. As far as I'm concerned, the internet is dead.
 
 