Fully automated moderation assistant is here!
For a while now, every time a new multimodal LLM or text embedding model has dropped, I've tried some variation of this project, and today I finally got it working. Anthropic recently released Claude 3.5 Sonnet, which tops the charts both in general intelligence and, importantly, in its understanding of image content. On top of that, it's very affordable, at less than a penny per image reviewed! This is the only model I've ever managed to get to properly handle tasks like this, which involve moderating image/text content. I can't express enough how frustrating it was trying to get models like GPT-4 to do a good job at this, but Sonnet 3.5 does it almost perfectly.
It's still going to take a bit of tinkering to get it to its final state, but I've let Ruben know about this, and hopefully it'll get added as a tool for ModForDummies to use. His community efforts around collecting and reporting inappropriate content are awesome, and especially with the stuff he's been dealing with lately, I really hope this can help cut down the workload for him and his team. Get well soon, king!