Discord, big data, and pedophiles - The application of data science to Discord degeneracy

This is the same Discord that a few years ago had a furry "moderator corruption" problem, so it wouldn't be too surprising if it turned out some people are complicit in keeping awful things around.
That article says Discord had 19 million users in 2019. That figure was 300 million as of the start of this year.
Makes you realize both the scale of the problem and how unlikely it is that Discord LLC is capable of actively monitoring to a sufficient level. They're already struggling to turn a profit with the service, you can almost guarantee they're asleep at the wheel as far as staying on top of this shit.
 
Jesus man, I'm running out of trophies, birds, and fruit:

Kiwi Hero 2.jpeg
 
That article says Discord had 19 million users in 2019. That figure was 300 million as of the start of this year.
Makes you realize both the scale of the problem and how unlikely it is that Discord LLC is capable of actively monitoring to a sufficient level. They're already struggling to turn a profit with the service, you can almost guarantee they're asleep at the wheel as far as staying on top of this shit.
Discord will never be profitable.

Internet-based startups with a generous free service have always struggled to be profitable with their main product. The only way you'll see internet services like Discord, YouTube, Spotify, Twitch and Uber actually "profit" is through secondary investments in other services that are profitable, stock-market plays or shifting money around (cooking the books) to make things look better.

Anything that requires processing-intensive operations (WebRTC/audio/video streaming is extremely costly) absolutely blows out costs. I've done some work for clients that required WebRTC before, and it's mind-boggling how expensive things get at scale.

MicroStrategy is a great example of this. They sell products and services, but were making a fucking fortune off Bitcoin investing (the classic Michael Saylor story).
 
That article says Discord had 19 million users in 2019. That figure was 300 million as of the start of this year.
Makes you realize both the scale of the problem and how unlikely it is that Discord LLC is capable of actively monitoring to a sufficient level. They're already struggling to turn a profit with the service, you can almost guarantee they're asleep at the wheel as far as staying on top of this shit.
The numbers are quite staggering. Apparently Discord Inc. had only around 600 employees globally as of late 2022, so one has to assume they've outsourced most of the moderating to individual servers and hope those can do a good job themselves, or else lean heavily on automated processes. It's probably a combination of both, but either way, even if the company were extremely vigilant and quick to remove material that shouldn't be on the platform, some terrible stuff is statistically bound to slip through the cracks from time to time, unfortunately.
 
I wouldn't be surprised at all to see one of the big 'household name' tech people arrested within the next year for something so horrific that it changes how we view the whole internet.
Well... Bill Gates was exposed a few years ago as having visited Epstein's island several times, with the explanation and story changing each time.
 
The scale is quite staggering. Apparently Discord Inc. had only around 600 employees globally as of late 2022, so one has to assume they've outsourced most of the moderating to individual servers and hope those can do a good job themselves, or else lean heavily on automated processes. It's probably a combination of both, but either way, whether or not the company has secret bad actors inside it, some terrible stuff is statistically bound to slip through the cracks, unfortunately.
What worries me here is that at the global and server level, moderation is probably successful at controlling the worst and most overt stuff - sharing of CSAM, open child abuse, flagrant violations of the law in its many forms (e.g. organizing drug buys).
But it's entirely ineffective at controlling anything even slightly subtler than that, and the sheer scope of harm there is staggering. Worse, it comes down to finger-pointing: they can't afford the resources to patrol this at the global level, so it's devolved to the local level, where of course moderation may or may not be present. We're just never gonna check, and we assume everything is fine. Except it's very apparent to even a casual observer that things are not fine, and there is a serious, growing problem here. Plus it's been going on, and getting worse, for years now. Short of OP's amazing work, this kind of discovery simply isn't possible by anyone other than Discord LLC, who as I already said are simply asleep at the wheel, and any change in this space would just harm their already unprofitable earnings.
Discord will never be profitable.

Internet-based startups with a generous free service have always struggled to be profitable with their main product. The only way you'll see internet services like Discord, YouTube, Spotify, Twitch and Uber actually "profit" is through secondary investments in other services that are profitable, stock-market plays or shifting money around (cooking the books) to make things look better.

Anything that requires processing-intensive operations (WebRTC/audio/video streaming is extremely costly) absolutely blows out costs. I've done some work for clients that required WebRTC before, and it's mind-boggling how expensive things get at scale.

MicroStrategy is a great example of this. They sell products and services, but were making a fucking fortune off Bitcoin investing (the classic Michael Saylor story).
And therein lies a large part of the problem - You have a service provider that's already not making money despite only performing the bare minimum of oversight. It's against their interests to even acknowledge that this problem exists. They're going to keep tacitly allowing the problem to worsen, and increasingly so as they struggle to remain afloat. Kinda like how Uber et al seemed at first to undercut traditional taxi services with no downsides, until it soon came to light that all the shit they were omitting in the name of running a leaner service meant various safety measures, like vetting out felons so lone women don't get raped at 2am in some random 'cab', were part of the fat that was cut.
 
The numbers are quite staggering. Apparently Discord Inc. had only around 600 employees globally as of late 2022, so one has to assume they've outsourced most of the moderating to individual servers and hope those can do a good job themselves, or else lean heavily on automated processes. It's probably a combination of both, but either way, even if the company were extremely vigilant and quick to remove material that shouldn't be on the platform, some terrible stuff is statistically bound to slip through the cracks from time to time, unfortunately.

What's even scarier is that until recently there was no in-platform report functionality, either for standard users or for server admins/mods.

You still have to paste message IDs and user IDs, which by default requires the individual to go into Developer Mode.

For 7 years Discord had no effective reporting system -- if someone found evidence of such things or experienced them, all it took to prevent a report was kicking them from the server, denying them access to the IDs.
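A small aside on those IDs: Discord's API documentation describes them as "snowflakes" with the creation timestamp embedded in the top bits, so an ID copied before a kick at least still dates the message it came from. A minimal sketch (the example ID below is fabricated for illustration):

```python
# Sketch: recover the UTC creation time embedded in a Discord snowflake ID.
# Per Discord's API docs, the top 42 bits are milliseconds since the
# Discord epoch (2015-01-01T00:00:00Z).
from datetime import datetime, timezone

DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z in Unix milliseconds

def snowflake_to_datetime(snowflake: int) -> datetime:
    """Shift off the worker/process/sequence bits, add the Discord epoch."""
    ms = (snowflake >> 22) + DISCORD_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# Fabricated example ID: millisecond offset in the high bits, junk below.
example_id = (1234567890123 << 22) | 42
print(snowflake_to_datetime(example_id))
```

So even when a kick cuts off access to the server, a report that already includes the raw IDs carries a timestamp a platform can verify.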

Discord, unlike other platforms, gives users zero control over their data if they leave a server or have their account deleted. That means someone could exploit a child into posting images of themselves in a server, kick them, and the poster would have zero ability to remove that material or remove their association with it. This quite likely breaches the European GDPR; however, as far as I'm aware, it has never been challenged in court.

To give you an idea of reports I've sent over the years...

I'd say roughly I've sent in about 50 comprehensive reports to Discord over 7 years.

These reports were for shit relating to active child groomers, loli and shota servers, and zoo shit.

Despite user IDs, despite a list of usernames, despite evidence, despite message IDs, despite it breaching every aspect of the TOS and most laws in the western world... about 10 of them resulted in wide-scale bans, 4 more resulted only in the admins of those servers having their accounts banned, and nothing came of the rest. Of the 14/50 successful reports, 3 involved users coming back on a VPN to evade the ban, admitting to it, having that reported, and facing zero moderation.
 
@grand larsony Thanks for the answer. Appreciate it.

Green - seems to be social club type servers without a strong common theme in the naming. Lots of mentions of "club", "cafe", "hangout", etc.
I can maybe help you out there - I remember when that stuff started.

That was after I left Discord, but I got asked about them so I did a bit of research - you'll find that many of those started as bragging groups for owners/promoters of trash-tier NFTs (those trying to ape the Bored Ape morons' shtick but unable to manage making a website, so doing it on Discord). I'm surprised they're still around, TBH.

I don't know if you classified shit like "Pokimane appreciation club" in this category or under "personal servers" - those simp servers are completely cancerous: no age limit on entry, full of coomer shit (sex talk, porn and loli, and even bullshit like ERP channels without age screening).

It used to be that Discord was nothing but gaming stuff (in fact, when you downloaded it the program description was "gamers' chat"). I'm disgusted to see from your server grouping diagram that this is now a small fringe of what the program is used for. I'm also interested in how this happened - whether it was organic or if the company made a conscious attempt to "widen the userbase by broadening the appeal". Anyone know?

@grand larsony , I wish you luck in monetising this bot of yours. You deserve it.
 
And therein lies a large part of the problem - You have a service provider that's already not making money despite only performing the bare minimum of oversight. It's against their interests to even acknowledge that this problem exists. They're going to keep tacitly allowing the problem to worsen, and increasingly so as they struggle to remain afloat. Kinda like how Uber et al seemed at first to undercut traditional taxi services with no downsides, until it soon came to light that all the shit they were omitting in the name of running a leaner service meant various safety measures, like vetting out felons so lone women don't get raped at 2am in some random 'cab', were part of the fat that was cut.
This reminds me of Kik. I'm not sure how old people are here but if you're around my age you probably remember Kik blowing up when you were a teenager. If anyone reading doesn't remember it, it filled kinda the same role as Discord, a simple free messenger that had features that made it convenient to use.
I'm paraphrasing from this Darknet Diaries episode, but tl;dr what happened after we all outgrew it and switched to other platforms was that they struggled to maintain profitability, so the company got sold. The company that bought it specializes in buying dying chat platforms, stuffing them full of ads, stripping down their functionality to reduce cost, and cutting the staff so that the service is functionally a zombie of its former self. As part of that process with Kik, they fired all the safety staff. Grooming exploded, and now the entire app basically exists as a child porn and grooming service that nobody outside of those purposes really uses for anything.
Since the staff are basically nonexistent, outside of court orders, there's very little that can be done with fucked up disgusting content on there. You can report things, but nobody reads the reports. This is basically the worst case scenario for any service that children use. The service dying would be fine, but the service continuing to stay up with even less moderation than before is far worse.
It used to be that Discord was nothing but gaming stuff (in fact, when you downloaded it the program description was "gamers' chat"). I'm disgusted to see from your server grouping diagram that this is now a small fringe of what the program is used for. I'm also interested in how this happened - whether it was organic or if the company made a conscious attempt to "widen the userbase by broadening the appeal". Anyone know?
Kind of an old article - https://www.cnbc.com/2021/05/08/wha...-fosters-community-expands-beyond-gaming.html
They've been trying to expand beyond gaming for a while. I think the transition is natural. Something they encouraged, but something that would've happened anyway, sooner or later. If it's a good chat app for gamers, it's probably just a good chat app in general. Their branding at the start probably helped them to grow but now that they're a household name, there's no reason to constrain themselves to only being for gamers.
As for the classification I did in the previous scatter plot, it was done automatically. The cluster you're referring to was full of stuff like (making up fake but similar names here) "Brandon's Movie Theater" or "Carly's Cool Club". Nothing NFT-related that I could glean from the names, seemed to mostly be people who just wanted general chat with like-minded individuals. Loneliness crisis and all that.
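For anyone curious what "done automatically" can look like in miniature: here's a crude keyword-bucket stand-in in pure Python. The labels and keyword lists are made up for illustration - the real pipeline presumably used embeddings and clustering rather than hand-picked words - but it shows how names like "Carly's Cool Club" end up grouped together:

```python
# Hypothetical sketch: bucket server names by shared keywords.
# Labels and keyword sets are illustrative, not the classifier actually used.
from collections import defaultdict

KEYWORDS = {
    "social": {"club", "cafe", "hangout", "lounge"},
    "gaming": {"gaming", "minecraft", "valorant", "esports"},
}

def bucket(name: str) -> str:
    """Assign a server name to the first label whose keywords it mentions."""
    tokens = set(name.lower().split())
    for label, words in KEYWORDS.items():
        if tokens & words:
            return label
    return "other"

names = ["Carly's Cool Club", "Midnight Cafe", "EU Valorant Scrims", "Art Dump"]
groups = defaultdict(list)
for n in names:
    groups[bucket(n)].append(n)
```

An embedding-based approach scales to names with no overlapping words at all, but the idea is the same: names that "talk alike" land in the same cluster.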
Thanks for the kind words. Hopefully at some point I'll be able to come back here and post another network graph but with millions of nodes instead of thousands!
 
I just hope OP puts this stuff together as simply, objectively, and as easy to understand as possible once he collects all the data, so it's all nice and bulletproof.
When people ask you to go looking for correlations, that's usually how holes get poked: "if you go looking for a pattern, you can arrange any data to find one."
It'd just be great to have all the t's crossed and i's dotted about Discord's swamp.
 
The numbers are quite staggering. Apparently Discord Inc. had only around 600 employees globally as of late 2022, so one has to assume they've outsourced most of the moderating to individual servers and hope those can do a good job themselves, or else lean heavily on automated processes.

What makes you think they want to moderate? Twitter/X got hit with boycotts from various companies after Musk took over, even though CP was largely quashed there, and yet Meta and Instagram are fine.

insta.jpg

It's like everyone forgot the Epstein client list was sealed by a US Judge. And Biden flew out to the Canary Islands to fire the Attorney General who attempted to investigate further.
 
This would be an objectively terrible idea if you want to see this data go towards anything remotely useful or meaningful.

Crime stats already give you the answer to that question anyways.
I'm more interested in where the troon gooners are interacting outside of their gross cumservers. Either way, this is really informative and cool to see.
 
got bored of fucking goats
0.96359253
Honestly, a few of these don't sound like zoophilia, just edgy comments taken out of context. This one could easily be about somebody talking about members of the religion of peace, for example. I wouldn't rely on anything Llama-powered to understand context and subtlety too well, to be entirely honest with you. Not yet, anyway - and Purple Llama is really, really new. I'm not saying there isn't a ton of sex pests on Discord, but I'm not sure I'd trust data like this fully yet. Even a tiny bias towards false positives can skew data like this terribly.
 
It keeps falling into local minima where it decides the best course of action is to simply pick a low score for every piece of text, since that still gets it >99% correct answers.
You can try scoring the outcomes asymmetrically, for example punishing a false negative more than a false positive, to reflect the fact that positives are rare but critical. Sort of like how models are trained to detect cancer in X-rays.
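To make the asymmetry concrete, here's a minimal sketch (pure Python; the weights are illustrative, not tuned values) of a class-weighted log loss where a missed positive costs far more than a false alarm:

```python
# Sketch: class-weighted cross-entropy for a rare-positive classifier.
# The 20:1 weighting is an illustrative choice, not a recommendation.
import math

def weighted_log_loss(y_true, y_prob, fn_weight=20.0, fp_weight=1.0):
    """Log loss where missing a rare positive hurts 20x more than a false alarm."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, 1e-9), 1 - 1e-9)  # clamp to avoid log(0)
        if y == 1:
            total += -fn_weight * math.log(p)      # missed positives hurt a lot
        else:
            total += -fp_weight * math.log(1 - p)  # false alarms hurt less
    return total / len(y_true)

# A model that scores everything near zero is 99% "accurate" on this data...
lazy = weighted_log_loss([0] * 99 + [1], [0.01] * 100)
# ...but one that actually flags the lone positive wins under the weighted loss.
alert = weighted_log_loss([0] * 99 + [1], [0.01] * 99 + [0.9])
```

Under plain accuracy the lazy scorer looks nearly perfect; the weighted loss flips the ranking, which is exactly the pressure you need to push the model out of the "predict low for everything" local minimum.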
 
Of course, I'm wondering where else within Discord.
Oh right - I would suspect group chats within the direct-messaging space, since those are easy to delete from your client quickly if need be. Not sure about Discord's data retention on those groups; I'd imagine deleting a group doesn't actually drop any rows from their system.
 
I think a good alternative end goal for your data, if you don't go the academic publishing route, would be to build a clearnet website with a live interactive graph of the data. Something where normies could see each cluster of server types sorted by interest (gaming, Chinese-language servers, personal servers, other server groups) and flip through different layers showing how those interest types often have messages associated with or containing CSAM, zoosadism, troonism, etc. - highlighting how many "power users" there are and the overlap.
 