Hi
I am a computer science student and am just starting my master's thesis. My focus will be on content moderation (algorithms), so I am currently exploring how some social media applications moderate content.
If I understand the docs correctly, content moderation on Mastodon is all manual labor? I haven’t read anything about automatic detection of Child Sexual Abuse Material (CSAM), for example, which is something most centralised platforms seem to do.
A related question concerns the reposting of already moderated content, for example a racist meme that was posted before. Are there any measures in place to detect this?
Thank you for your help!
I would encourage you to join !KHrhpVwWgHNciqCMTP:matrix.iftas.org
Lemmy (a Fediverse alternative to Reddit) does have a community-driven tool, Fedi Safety, for detecting and deleting CSAM. Some instances use it, but I don’t have any statistics on that.
If you haven’t already, you’ll probably want to read this study on CSAM from Stanford, which discusses the lack of automated tooling (and how PhotoDNA isn’t really equipped to handle thousands of Mastodon servers).
Some other Stanford researchers (including Alex Stamos, former CSO for Facebook) just put out this piece on Mastodon moderation too, which is worth a look.
When considering moderation, it’s also worth thinking about the role of defederation. From the first report, the CSAM on Mastodon sounds like it’s mostly on Japanese servers, and most western-oriented servers have defederated from them over that, so the content won’t travel across the network. I know that if that appeared on my server, the admins would probably defederate until the source cleaned up their act.
Thank you for your response. How could I not have found this study myself? I am pretty sure it will answer a lot of the questions I have now, and probably ones I will have while digging deeper into the subject. This really helps me.
I will now read the study and Mr. Stamos's blog post. It is also good to know that server admins and moderators seem to act fast on those issues. Moderation is always a delicate thing, but when it comes to illegal content it is really a must-have.
The APIs exist to perform moderation via a separate application, so in theory automated content moderation is possible, but most servers are relatively small and funding is limited. This means there is little motivation to develop or adopt automated moderation.
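As a rough illustration of what moderation via a separate application could look like, here is a minimal Python sketch that builds a request for unresolved reports from a Mastodon server's admin API. The endpoint path (`/api/v1/admin/reports`), the `resolved` and `limit` query parameters, and the need for a token with the `admin:read:reports` scope are based on my reading of the Mastodon API docs and should be checked against your server's version. The host `example.social`, the token, and both function names are placeholders I made up.

```python
import json
import urllib.parse
import urllib.request


def build_reports_request(base_url: str, token: str, limit: int = 20) -> urllib.request.Request:
    """Build (but do not send) a GET request for unresolved reports.

    Assumption: the server exposes GET /api/v1/admin/reports and accepts
    the `resolved` and `limit` query parameters.
    """
    query = urllib.parse.urlencode({"resolved": "false", "limit": limit})
    url = f"{base_url.rstrip('/')}/api/v1/admin/reports?{query}"
    # The token is assumed to carry an admin scope such as admin:read:reports.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_reports(request: urllib.request.Request) -> list:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder host and token; nothing is sent until fetch_reports() is called.
    req = build_reports_request("https://example.social", "YOUR_ADMIN_TOKEN", limit=5)
    print(req.full_url)
```

An external tool could poll this on a schedule and pass each report to whatever classifier or review queue it likes, while leaving the final decision to a human moderator.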
The primary mechanism of moderation is excluding people who break the rules. Someone on your server breaks a rule, you ban them. They move somewhere else and break your rules from there, and the mods over there don’t care, so you exclude that instance from your server (defederate).
It’s a simple but pretty effective system.
I think a kind of spider-style bot that flags instances with high levels of rule breaking, in order to suggest ban lists, might be effective, but all automoderation comes with some level of error and we should be wary.
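To make the flagging idea a bit more concrete, here is a minimal Python sketch, not an existing tool. It assumes you already have a list of reported account handles (for example, exported from your own moderation queue), counts them per remote instance, and suggests instances for human review once they pass a threshold. The function names, the `min_reports` threshold, and the sample domains are all hypothetical.

```python
from collections import Counter


def domain_of(acct: str, local_domain: str) -> str:
    """Return the instance domain of a handle such as 'user@example.org'.

    Handles without an '@' are treated as local accounts.
    """
    return acct.rsplit("@", 1)[1].lower() if "@" in acct else local_domain


def suggest_instances_for_review(reported_accts, local_domain, min_reports=3):
    """List remote domains with at least `min_reports` reported accounts.

    The result is a suggestion for a human moderator, not an automatic
    ban list. Domains are sorted by report count (highest first), then by name.
    """
    counts = Counter(domain_of(acct, local_domain) for acct in reported_accts)
    counts.pop(local_domain, None)  # never suggest defederating from ourselves
    flagged = [domain for domain, n in counts.items() if n >= min_reports]
    return sorted(flagged, key=lambda domain: (-counts[domain], domain))


if __name__ == "__main__":
    reports = [
        "a@bad.example", "b@bad.example", "c@BAD.example",
        "d@ok.example", "local_user",
    ]
    print(suggest_instances_for_review(reports, "home.example"))
```

Raw counts like this favour small, quiet instances over large ones, so a real version would probably want to normalise by instance size, which is one more reason to keep a human in the loop.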
If APIs exist to perform moderation via a separate application, would it not be worthwhile for instances, or even concerned individuals and companies, to work together and create a single, common Mechanical Turk-style, or even (dare I say it) AI-based moderation SaaS that could also tie into the PhotoDNA system at scale?
Maybe, but who’s paying? And why are they releasing it to the public for free?
Companies want to make money. Individuals want to spend as little as they can.
Many companies have also recently been pushing back against open source because they can’t license things the way they want to in order to maximize profit. So there are a few companies who would pay for such a thing and release it as open source, but not many, and they have other priorities.
Maybe Bluesky or whatever will care enough…
What do you mean? There are tools out there already. Lemmy had one made recently that helps with CSAM. Plus, the Fedi is always touting donations as the way to go, so why wouldn’t people donate if funds are needed?
Thank you for your reply. Neat that there are already APIs which would technically enable some kind of automated moderation. But I understand that automation should be treated with caution, especially since ML algorithms have problems with context.