NB! This article may contain offensive content such as racism, sexism, and other discriminatory or otherwise unfair descriptions.
Click on figures to display in interactive mode.
Click here to read this blog post in Norwegian
Abdera’s goal of online comment moderation is a deceptively simple task, and it requires a more comprehensive approach than simply using a word filter to separate “good” words from “bad” ones. While many large social media pages rely on this method, it has several limitations: it is not robust to linguistic variation such as dialect, typos, and differences in spelling; it misses harmful content that uses words outside the filter list; and it produces false positives by being overly sensitive to certain words. As a result, human moderation has been a necessity for most pages that aim to keep their comment sections open.
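To make these limitations concrete, here is a minimal sketch of a word-filter moderator. The blocklist and comments are invented examples, not part of our actual system:

```python
# A naive word-filter moderator: flag a comment if any token matches a
# blocklist exactly. Blocklist and examples are invented for illustration.
BLOCKLIST = {"idiot", "stupid"}

def word_filter_flags(comment: str) -> bool:
    """Return True if any token (case-folded, punctuation-stripped) is blocked."""
    tokens = comment.lower().split()
    return any(token.strip(".,!?") in BLOCKLIST for token in tokens)

# A trivial misspelling slips through (false negative):
print(word_filter_flags("you are an idi0t"))                  # False
# A harmless mention of a blocked word is flagged (false positive):
print(word_filter_flags("Calling people stupid is wrong."))   # True
```

Exact string matching has no notion of meaning, which is why both failure modes above are unavoidable for this approach.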
At Abdera, we are committed to developing advanced AI-based moderation and analysis tools. We recognize the importance of online comment sections in public debates, but as these discussions become more accessible, the likelihood of inappropriate content increases. The rise of generative AI like ChatGPT and Dall-E may only accelerate this trend.
However, we believe that the response of closing comment sections by large media outlets is counterproductive. This approach pushes users towards smaller forums and echo chambers, stifling the healthy exchange of diverse opinions.
We believe that the value of effective moderation lies not only in detecting and removing harmful content, but also in understanding how social media users engage with and respond to different types of content — making the continual process of public feedback and media response function as it should.
In this blog post, we have organized a set of interactive graphs that summarize 40,000 anonymized social media comments in Norwegian (with annotations of hate speech) by their topics and key words. For the sake of privacy, the names of commenters are replaced with NAVN (making this a common key word).
The final topic structure reveals some surprising, and many unsurprising, results. Unsurprisingly, figure 1 relates topics such as “101_sosialisme_sosialist_sosialister_eus_”, “103_diktatur_demokrati_demokratiet_et”, and “76_israel_palestinere_palestinerne_palestina”: all concerning commonplace political subject matter. These topics are in turn related to a category of political figures, “thunberg_listhaug_greta_raja_sylvi”, and, a bit more surprisingly, to a lesser degree “69_mgp_mgp2020_finalen_mgpfinalen”. The hierarchy connects subject matter such as heavily debated political events, news articles, and common insults (e.g. “74_fitte_tatoveringer_din_hore” and “93_mora_mamma_moren_mi”). Visualisations such as these allow for a quick overview of big data, make it easy to spot outliers (such as completely irrelevant content or spam), and support apt categorisation.
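A hierarchy like the one in figure 1 can be built by clustering topic representations agglomeratively. The sketch below uses invented toy vectors in place of real topic embeddings (in practice these would come from per-topic c-TF-IDF or averaged sentence embeddings); the names echo the topics above:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy topic vectors standing in for real topic embeddings (invented values).
topic_names = ["sosialisme", "demokrati", "israel_palestina", "mgp"]
topic_vecs = np.array([
    [0.9, 0.1, 0.0],   # political
    [0.8, 0.2, 0.1],   # political
    [0.7, 0.3, 0.2],   # political
    [0.0, 0.1, 0.9],   # entertainment
])

# Ward linkage produces the dendrogram structure behind figure 1.
Z = linkage(topic_vecs, method="ward")

# Cutting the tree into two clusters separates politics from entertainment.
labels = fcluster(Z, t=2, criterion="maxclust")
print(dict(zip(topic_names, labels)))
```

Cutting the dendrogram at different heights yields coarser or finer topic categories, which is how high-level groupings such as “political subject matter” emerge from individual topics.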
Figure 2 visualises these trends by their relative distance. We observe, for instance, that “5_islam_muslimer_muslim_er” fully overlaps “2_rasist_innvandring_er_nordmenn”. Perhaps more interesting are the non-overlapping topics: those related more indirectly. For instance, topics 13 & 7, concerning wealth and NAV (the welfare administration), lie close to topics 5 & 2, directly illustrating a relation between the topics of Islam, immigration, and religion on the one hand and welfare on the other. The same holds for police, the justice system, drugs, Islam, and immigrants, as well as for environment, meat, petroleum, MDG, and FrP.
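The quantity behind an intertopic distance map like figure 2 is pairwise distance between topic vectors. A minimal sketch, with invented two-dimensional vectors in place of real topic embeddings:

```python
import numpy as np

# Pairwise cosine distances between topic vectors (invented toy values).
def cosine_distance_matrix(vecs: np.ndarray) -> np.ndarray:
    normed = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    return 1.0 - normed @ normed.T

topics = {"islam": [0.9, 0.1], "innvandring": [0.85, 0.15], "mgp": [0.1, 0.9]}
vecs = np.array(list(topics.values()))
D = cosine_distance_matrix(vecs)

# Closely related topics sit near zero distance; unrelated ones far apart.
print(D.round(3))
```

In the real visualisation the high-dimensional distance matrix is additionally projected to two dimensions, so that overlapping circles correspond to near-zero distances like the islam/innvandring pair here.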
In order to properly recognise trends, we need to establish a baseline of user feedback. Figure 5 shows a mapping of our data with topics as coloured clusters and each comment as a datapoint, where nearby comments are similar in content. Using this as our baseline, we may map any arbitrary comment section to establish the general response to the topic at hand.
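The core of such a mapping is assigning an incoming comment to the nearest topic in embedding space. A minimal sketch, assuming each topic keeps a centroid; the centroids and the comment embedding are invented stand-ins (in practice the embeddings would come from a sentence encoder such as Sentence-BERT):

```python
import numpy as np

# Invented topic centroids in a toy 3-dimensional embedding space.
CENTROIDS = {
    "immigration": np.array([0.9, 0.1, 0.0]),
    "welfare":     np.array([0.1, 0.9, 0.0]),
    "environment": np.array([0.0, 0.1, 0.9]),
}

def assign_topic(embedding: np.ndarray) -> str:
    """Assign a comment embedding to the most cosine-similar topic centroid."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(CENTROIDS, key=lambda t: cos(embedding, CENTROIDS[t]))

print(assign_topic(np.array([0.8, 0.2, 0.1])))  # nearest: "immigration"
```

Mapping every comment of a new section this way yields a topic distribution that can be compared against the baseline.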
We draw a random online comment section of 386 comments and compare the results to familiar controversial topics. This allows us to analyse commenters’ main points of interest, the spread of the comments’ content, and how comment threads develop over time (e.g. tracing threads from political discussions all the way to personal attacks).
Figure 6 illustrates our sampled comment section compared to known hateful comments. We find that the comment section is not significantly affected by the most controversial topics; its topics include illness, psychiatry, drugs, and replies (through the key word “NAVN”). A potential shift towards other, irrelevant topics would be quickly recognised, and an automated moderator could intervene or notify a human moderator.
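One simple way to detect such a shift is to compare the topic distribution of a sampled section against the baseline distribution. A minimal sketch using total variation distance; the topic counts below are invented for illustration:

```python
from collections import Counter

def topic_distribution(assignments):
    """Turn a list of per-comment topic assignments into relative frequencies."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two topic distributions, in [0, 1]."""
    topics = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in topics)

# Invented counts: baseline vs. a freshly sampled comment section.
baseline = topic_distribution(["illness"] * 50 + ["psychiatry"] * 30 + ["replies"] * 20)
sampled  = topic_distribution(["illness"] * 45 + ["psychiatry"] * 35 + ["replies"] * 20)

drift = total_variation(baseline, sampled)
print(drift)  # small drift: section tracks the baseline
```

A drift above some threshold (the threshold itself would need tuning on real data) could then trigger the automated or human intervention described above.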
By locating trends visually, one can quickly evaluate comment sections as well as how their responses change over time. Thus, one may better determine when and how content in public debate fora approaches controversial topics, and follow the process of escalation in real time. Anonymised user data may be tracked over time and used in a predictive setting, generating useful feedback for viewers and commenters as well as for authors and moderators on the administrator side. Content moderation may be enhanced by providing topic and commenter trends along with the potentially harmful content at hand, allowing harmful content to be sorted and prioritised. We want to simplify moderation while improving the connection to one’s audience.
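Following escalation in real time can be sketched as comparing a flagged topic’s share of comments across time windows. The timestamps and topic labels below are invented examples:

```python
# Invented (timestamp, topic) pairs for a toy comment stream.
comments = [
    (0, "politics"), (1, "politics"), (2, "welfare"),
    (10, "politics"), (11, "insults"), (12, "insults"), (13, "insults"),
]

def windowed_share(comments, topic, window, start):
    """Share of comments in [start, start + window) assigned to `topic`."""
    in_window = [t for ts, t in comments if start <= ts < start + window]
    return in_window.count(topic) / len(in_window) if in_window else 0.0

# The share of insults jumps between the first and second window,
# signalling the kind of escalation a moderator would want surfaced.
early = windowed_share(comments, "insults", window=10, start=0)
late = windowed_share(comments, "insults", window=10, start=10)
print(early, late)
```

Tracking such per-window shares over a live comment section is what makes it possible to surface escalation as it happens rather than after the fact.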
Medietilsynet (2021) Delrapport 2: Sjikane via internett, og konsekvenser dette har for demokrati og deltakelse. Available at: https://www.medietilsynet.no/globalassets/dokumenter/rapporter/2021-kritisk-medieforstaelse/210427-kmf-2021-delrapport-2-sjikane-og-hat.pdf (Accessed: 10 January 2023).
Reimers, Nils, and Iryna Gurevych. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv, 27 August 2019. arXiv.org, http://arxiv.org/abs/1908.10084.
Grootendorst, Maarten. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv, 11 March 2022. arXiv.org, http://arxiv.org/abs/2203.05794.