Abdera - Healthy conversations powered by AI

Visualising Comment Sections

By Erland Grimstad on Apr 14, 2023

NB! This article may contain offensive content such as racism, sexism, and other discriminatory or otherwise unfair descriptions.
Click on figures to display in interactive mode.
Click here to read this blog post in Norwegian

Hierarchical clustering structure of comment section topics.
Fig. 1) A hierarchical tree structure of topics present in our comment dataset. Topics (the tree's leaf nodes) are named by an index number (ranked 0, 1, 2, ... from the most common topic) followed by their most significant key words, e.g. “1_most_significant_less_important” would be the 2nd most common category.

Online comment moderation, Abdera’s goal, is a deceptively simple task: it requires a more comprehensive approach than simply using a word filter to distinguish “good” words from “bad” words. While many large social media pages use this method, it has several limitations: it is not robust enough to handle linguistic inconsistencies such as dialect, typos, and variations in spelling; it can miss harmful content whose words are not in the filter list; and it can produce false positives by being overly sensitive to certain words. As a result, human moderation has been a necessity for most pages aiming to keep their comment sections open.
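To make these limitations concrete, here is a minimal sketch of a naive word filter of the kind described above; the blocklist and example comments are invented for illustration:

```python
# A naive word filter: flag a comment if it contains a blocklisted word.
BLOCKLIST = {"idiot"}  # illustrative blocklist, not an actual filter list

def is_flagged(comment: str) -> bool:
    # Exact word matching: no handling of dialect, typos, or spelling variants.
    return any(word in BLOCKLIST for word in comment.lower().split())

print(is_flagged("you idiot"))    # True: exact match is caught
print(is_flagged("you idiiot"))   # False: a simple typo slips through
print(is_flagged("people like you should not vote"))  # False: no "bad" word at all
```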

At Abdera, we are committed to developing advanced AI-based moderation and analysis tools. We recognize the importance of online comment sections in public debates, but as these discussions become more accessible, the likelihood of inappropriate content increases. The rise of generative AI such as ChatGPT and DALL-E may only accelerate this trend.
However, we believe that large media outlets’ response of closing their comment sections is counterproductive. This approach pushes users towards smaller forums and echo chambers, stifling the healthy exchange of diverse opinions.

We believe that the value of effective moderation lies not only in detecting and removing harmful content, but also in understanding how social media users engage with and respond to different types of content — making the continual process of public feedback and media response function as it should.

In this blog post, we have organized a set of interactive graphs that summarize 40,000 anonymized social media comments in Norwegian (annotated for hate speech) by their topics and key words. For the sake of privacy, commenters’ names are replaced with NAVN (Norwegian for “name”), making this a common key word.
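Our sources below describe the kind of topic-modelling pipeline that can produce such summaries. As a minimal sketch, and not necessarily our exact pipeline, the BERTopic library (cited in the sources) can be fitted on a comment collection like ours; the data-loading helper, variable names, and the choice of multilingual embedding model are illustrative assumptions:

```python
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

# comments: a list of 40,000 anonymized Norwegian comments.
comments = load_comments()  # hypothetical loading helper, not a real API

# A multilingual sentence embedder that handles Norwegian reasonably well.
embedding_model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
embeddings = embedding_model.encode(comments, show_progress_bar=True)

# Fit the topic model on precomputed embeddings; topic -1 collects outliers.
topic_model = BERTopic(embedding_model=embedding_model)
topics, probs = topic_model.fit_transform(comments, embeddings)

# Topics are named by rank and their most significant key words,
# e.g. "2_rasist_innvandring_er_nordmenn".
print(topic_model.get_topic_info().head(10))
```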

A mapping of topics such that similar topics are near.
Fig. 2) Illustration of topic frequency (circle size) and similarity (distance to surrounding circles).

The final topic structure reveals some surprising, and many unsurprising, results. Unsurprisingly, figure 1 relates topics such as “101_sosialisme_sosialist_sosialister_eus_”, “103_diktatur_demokrati_demokratiet_et”, and “76_israel_palestinere_palestinerne_palestina”, all concerning commonplace political subject matter. These topics are in turn related to a category of political figures, “thunberg_listhaug_greta_raja_sylvi”, and, a bit more surprisingly, to a lesser degree “69_mgp_mgp2020_finalen_mgpfinalen”. The hierarchy connects subject matter such as heavily debated political events, news articles, and common insults (e.g. “74_fitte_tatoveringer_din_hore” and “93_mora_mamma_moren_mi”). Visualisations such as these allow for a quick understanding of big data, make it easy to spot outliers (such as completely irrelevant content or spam), and support apt categorisation.
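A hierarchy like the one in figure 1 can be derived directly from a fitted topic model. A sketch with BERTopic, reusing the hypothetical `topic_model` and `comments` from the snippet above:

```python
# Build a hierarchical merging structure over the fitted topics and
# render it as an interactive dendrogram, as in figure 1.
hierarchical_topics = topic_model.hierarchical_topics(comments)
fig = topic_model.visualize_hierarchy(hierarchical_topics=hierarchical_topics)
fig.write_html("topic_hierarchy.html")  # illustrative output path
```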

Figure 2 visualises these trends by their relative distance. We observe, for instance, that “5_islam_muslimer_muslim_er” fully overlaps “2_rasist_innvandring_er_nordmenn”. Perhaps more interesting are the non-overlapping topics: the topics that are related more indirectly. For instance, topics 13 & 7, concerning wealth and NAV (the Norwegian welfare administration), lie close to topics 5 & 2, directly illustrating a relation between the topics of Islam, immigration, and religion and that of welfare. The same holds for police, the justice system, drugs, Islam, and immigrants, as well as for environment, meat, petroleum, MDG, and FrP.
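Maps like figures 2 and 3 come from the same fitted model; a sketch, again assuming the `topic_model` from earlier:

```python
# Intertopic distance map: circle size reflects topic frequency and
# distance reflects topic similarity, as in figure 2.
topic_model.visualize_topics().write_html("topic_map.html")

# Pairwise topic similarity matrix with scores between 0 and 1, as in figure 3.
topic_model.visualize_heatmap().write_html("topic_similarity.html")
```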

A similarity graph displaying a grid of pairwise similarity.
Fig. 3) The similarity between any two topics, given by a score between 0 and 1 (every topic has a self-similarity of 1). Click on the image for the interactive map and mouse over it to examine pairwise similarities.
Topics categorised into hateful and not hateful comments by their topics.
Fig. 4) Topic differences distinguishing hateful comments from non-hateful comments, among the 10 most common topics.
Figure 4 displays trends in hate speech topics. As expected, the topics at hand relate to characteristics such as Muslims, women, and immigrants. This allows us to extract differences between hateful and non-hateful comments, even within highly controversial topics. In contrast, word filters would never be able to distinguish between constructive and non-constructive comments to the same degree.
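Since our comments carry hate-speech annotations, topic distributions can be split by class. A sketch using BERTopic's per-class utilities, where the `labels` list (one annotation per comment) is an assumed input:

```python
# labels: one "hateful" / "not hateful" annotation per comment (assumed given).
topics_per_class = topic_model.topics_per_class(comments, classes=labels)

# Compare the 10 most common topics across the two classes, as in figure 4.
fig = topic_model.visualize_topics_per_class(topics_per_class, top_n_topics=10)
fig.write_html("topics_per_class.html")
```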

Making the most of our data

With an improved tool for mapping user-generated feedback in real time, we reason that the debate climate may be more easily organised by the comment section host and guided towards relevant debates, rather than following the current trend of spiralling into completely irrelevant topics. In addition, user responses may be better predicted and analysed over time, making moderation needs more predictable and shifting the focus towards proactive moderation. Lastly, having an accessible database of easily identifiable traits of social media feedback and debate could help build a two-way relation with one's audience without spending too much time keeping track of one's comment section.

EXAMPLE: Mapping a New Comment Section

In order to properly recognise trends, we need to establish a baseline of user feedback. Figure 5 shows a mapping of our data with topics as coloured clusters and each comment as a data point. Nearby comments are similar in content. By employing this as our baseline, we may map any arbitrary comment section, establishing the general response to the topic at hand.
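A comment-level map like figure 5 can be drawn from the same model; a sketch reusing the precomputed `embeddings` from the first snippet:

```python
# Plot each comment as a point, coloured by its topic, as in figure 5.
# Passing precomputed embeddings avoids re-encoding all 40,000 comments.
fig = topic_model.visualize_documents(comments, embeddings=embeddings)
fig.write_html("comment_map.html")
```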

We draw a random online comment section of 386 comments and compare the new results to familiar controversial topics. This allows us to analyse commenters’ main points of interest, the spread of the comments’ content, and comment threads’ development over time (e.g. tracing threads from political discussions all the way to personal attacks).
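Mapping a new comment section onto the baseline amounts to running the already fitted model on unseen text. A sketch, where `new_comments` stands in for the sampled section of 386 comments:

```python
from collections import Counter

# new_comments: the 386 comments from the sampled section (loading omitted).
new_topics, new_probs = topic_model.transform(new_comments)

# Count which baseline topics the new section lands on.
print(Counter(new_topics).most_common(10))
```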

Displaying relative comment similarity on top of their categorisation.
Fig. 5) A mapping of comments (shown as data points) with their respective topics (given by colour).
Mapping a novel comment section onto the relative similarity map.
Fig. 6) A mapping of comments with overlay of hateful comments and comments from a novel comment section.
Interactive: Click to toggle hate speech / new comments below the figure.

Figure 6 illustrates our sampled comment section compared to known hateful comments. We find that the comment section is not significantly affected by the most controversial topics. Its topics include illness, psychiatry, drugs, and replies (through the key word “NAVN”). A potential shift towards other, irrelevant topics would be quickly recognised and could, if desired, trigger intervention by an automated moderator or a notification to a human moderator.
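One way to operationalise “quickly recognised” is to compare the new section's topic distribution against the baseline; a minimal sketch, where the distance threshold and the `notify_moderator` hook are illustrative assumptions:

```python
from collections import Counter

def topic_distribution(topic_ids):
    """Normalised topic frequencies, ignoring the outlier topic -1."""
    counts = Counter(t for t in topic_ids if t != -1)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

baseline = topic_distribution(topics)      # from the 40,000-comment baseline
section = topic_distribution(new_topics)   # from the sampled comment section

# Total variation distance between the two topic distributions (0 = identical).
all_topics = set(baseline) | set(section)
shift = 0.5 * sum(abs(baseline.get(t, 0.0) - section.get(t, 0.0)) for t in all_topics)

if shift > 0.6:  # illustrative threshold, to be tuned on real data
    notify_moderator()  # hypothetical hook into a moderation pipeline
```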

By locating trends in a visual way, one is able to quickly evaluate comment sections as well as how their responses change over time. Thus, one may better determine when and how content in public debate fora approaches controversial topics, and follow the process of escalation in real time. Anonymised user data may be tracked over time and used in a predictive setting, generating useful feedback for viewers and commenters as well as for authors and moderators on the administrator side. Content moderation may be enhanced by providing topic and commenter trends along with the potentially harmful content at hand, allowing harmful content to be sorted and prioritised. We want to simplify moderation, all the while improving the connection to one's audience.
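Tracking such trends over time is similarly direct when comments carry timestamps; a sketch, where the `timestamps` list (one per comment, aligned with `comments`) is an assumed input:

```python
# timestamps: one datetime per comment, aligned with the comments list.
topics_over_time = topic_model.topics_over_time(comments, timestamps)
fig = topic_model.visualize_topics_over_time(topics_over_time, top_n_topics=10)
fig.write_html("topics_over_time.html")
```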

SOURCES

Medietilsynet (2021) Delrapport 2: Sjikane via internett, og konsekvenser dette har for demokrati og deltakelse. Available at: https://www.medietilsynet.no/globalassets/dokumenter/rapporter/2021-kritisk-medieforstaelse/210427-kmf-2021-delrapport-2-sjikane-og-hat.pdf (Accessed: 10 January 2023).

Reimers, N. and Gurevych, I. (2019) Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv. Available at: http://arxiv.org/abs/1908.10084.

Grootendorst, M. (2022) BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv. Available at: http://arxiv.org/abs/2203.05794.