Hop in and vote!
I've been hard at work on this for most of the last two days, and I think it's ready for you to try publicly now.
Here's a basic overview of how it works:
1. People vote on groups that are potentially bad, but haven't hit the confidence threshold for being known as bad/good yet.
2. Once a group hits the confidence threshold for being known as bad, the scraper continues scraping from that point. Groups link to users, who link to other groups, and so on.
3. The network expands, bringing into the fold a new set of groups whose bad/good status is as yet unknown.
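For anyone curious how the three steps could fit together, here is a minimal sketch. It is not the site's actual code: the post doesn't say how the confidence threshold is computed, so this assumes a Wilson score interval on the share of "bad" votes, and the names `wilson_interval`, `classify`, `expand_frontier`, the 0.7 threshold, and the toy in-memory dictionaries are all made up for illustration.

```python
import math


def wilson_interval(bad_votes, total_votes, z=1.96):
    """Wilson score interval for the true share of 'bad' votes (z=1.96 ~ 95%)."""
    if total_votes == 0:
        return (0.0, 1.0)  # no votes yet, so nothing is known
    p = bad_votes / total_votes
    denom = 1 + z * z / total_votes
    centre = (p + z * z / (2 * total_votes)) / denom
    margin = z * math.sqrt(
        p * (1 - p) / total_votes + z * z / (4 * total_votes ** 2)
    ) / denom
    return (centre - margin, centre + margin)


def classify(bad_votes, total_votes, threshold=0.7):
    """Step 1: a group stays 'unknown' until the interval clears the threshold.

    The threshold value is an assumption, not taken from the post.
    """
    low, high = wilson_interval(bad_votes, total_votes)
    if low >= threshold:
        return "bad"
    if high <= 1 - threshold:
        return "good"
    return "unknown"


def expand_frontier(bad_group, group_members, user_groups, seen):
    """Steps 2-3: follow group -> users -> other groups, returning unseen groups.

    group_members and user_groups are hypothetical in-memory lookups;
    'seen' is updated in place so no group is queued twice.
    """
    new_groups = []
    for user in group_members.get(bad_group, []):
        for group in user_groups.get(user, []):
            if group not in seen:
                seen.add(group)
                new_groups.append(group)
    return new_groups


if __name__ == "__main__":
    group_members = {"g1": ["u1", "u2"]}
    user_groups = {"u1": ["g1", "g2"], "u2": ["g2", "g3"]}
    seen = {"g1"}
    if classify(19, 20) == "bad":
        # Newly found groups start out 'unknown' and go up for voting.
        print(expand_frontier("g1", group_members, user_groups, seen))
```

Using the lower bound of an interval rather than the raw vote share means a group with 3 of 5 "bad" votes stays unknown, while 19 of 20 crosses the line, so a handful of early votes can't flip a group on their own.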
This is only phase one. It's a far more efficient collection method than my last version from dozens of pages back in the thread, where I classified everything myself, so I can now collect data about good/bad groups much faster than before. In phase two, once more groups have reached the confidence threshold, I can train a classifier to tell good groups and users from bad ones, and hopefully expand even faster from there. Knowledge begets more knowledge, and by voting, you make the flywheel spin faster and faster.
Currently at just shy of 3 million users' profiles scraped. We have a long way to go with this and every vote matters. Hop in and make a couple votes if you have a few minutes. No registration needed, no personal information required, and VPN and Tor users are welcome to participate. (I have preemptively blocked Indians from participating.)
Happy to hear any feedback people have. If you encounter bugs, feel free to message me or say something in the thread. I am very, very excited about this project.