I’ve been fascinated by the topic of moderation–deciding who gets to post what on the internet–ever since I started working at the online dating site OkCupid, five years ago. The moderation team there was responsible for the near-impossible task of drawing the line between which messages counted as risqué flirtation (usually ok), illicit come-ons (possibly ok), and sexual harassment (which would get you banned). As RadioLab put it in their excellent podcast episode on the topic, “How much butt is too much butt?” Questions like these are tough enough, and then, if you’re Twitter, you have to decide what to do when the President’s tweets violate your Terms of Service. It’s a dirty job, but someone’s got to do it. Or do they? Can an AI handle moderation instead?

Some policy questions, like what to do with the President’s tweets or how to define hate speech, have no right answer. But in many more instances, and for many more platforms, bad content is easy to spot. You probably can’t share any sort of nudity or gore or hate speech on a professional networking app or an educational site for children. For these applications, machine learning can really help. Plus, since most apps aren’t public forums like Facebook or Twitter (where we have strong expectations of free speech), the consequences of being too harsh or conservative in filtering risky content are lower.

In this post, I’ll show you how to build your own AI-powered moderation bot for the chat platform Discord using the Perspective API. Don’t worry if you’ve never done any machine learning before–we’ll use the Perspective API, a completely free tool from Google, to handle the complicated bits. But, before we get into the tech-y details, let’s talk about some high-level moderation strategies.

With pre-moderation, a human reviews every piece of content before it’s published. With post-moderation, content goes live immediately; instead, the job of flagging posts usually gets crowdsourced to users, who are able to “flag” or “report” content they believe violates a site’s TOS. You see this almost everywhere (YouTube, Facebook, Instagram, and many more). Both of these approaches clearly come with drawbacks–pre-moderation requires a large human moderation team, and doesn’t work for real-time applications (chat, or any type of streaming), while post-moderation scales better, but forces your users to consume potentially offensive or disturbing media.

That’s where automated moderation comes in. This approach uses AI to flag inappropriate content the moment it’s created and prevent it from ever surfacing.
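To give a flavor of what the Perspective API does for us: under the hood it’s a plain REST endpoint–you POST a comment and get back scores (between 0 and 1) for attributes like TOXICITY. Here’s a minimal sketch of such a call, assuming you’ve enabled the API and created a key in a Google Cloud project; the helper function names here are my own, not part of the API:

```python
import json
import urllib.request

# The Perspective API's "comments:analyze" endpoint.
PERSPECTIVE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_request(text):
    """Build the JSON body for an AnalyzeComment call requesting a TOXICITY score."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def toxicity_score(response):
    """Pull the 0.0-1.0 summary toxicity score out of an API response dict."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

def analyze(text, api_key):
    """POST a comment to the Perspective API and return its toxicity score."""
    req = urllib.request.Request(
        f"{PERSPECTIVE_URL}?key={api_key}",
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return toxicity_score(json.load(resp))
```

A moderation bot can then compare that score against a threshold it chooses (say, 0.8) and delete or quarantine messages that exceed it.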