A Touch of Humanity: Why we are using annotators to train A.I. algorithms

February 3, 2022 — LISA DE SMEDT

 

Made to be Hateful

The Darkness of the Web

The contemporary online sphere, especially social media platforms, has grown increasingly toxic over the past couple of years. On all fronts, hate speech and fake news are allowed to run rampant and spread like wildfire. Many users of these platforms are convinced that everybody has the right to express themselves however they like. While this is certainly true, the consequences of free expression are often disregarded. Freedom of expression is a necessary right, but it does not exempt the speaker from the consequences of what they say. In a number of European countries, for example, Holocaust denial is illegal: one is free to deny the Holocaust publicly, but the consequence is that one will be prosecuted for expressing such a belief. 

Recently, it has become apparent that hateful speech online is not without consequences either. In the past two years, online toxicity has been shown to have repercussions in material life. Examples include the Capitol riots in the U.S.A. at the beginning of 2021; the shootings in Plymouth, U.K., by a man who had previously expressed misogynistic and homophobic views online; and the Belgian ex-soldier who disappeared in 2021 after being riled up by conspiracy theories circulating on social media, combined with extreme-right influences. Since the start of the COVID-19 pandemic, disinformation has also been spreading at an unprecedented pace: theories that 5G was causing the virus led to phone masts being burned down, individuals likening COVID measures to the Holocaust wore yellow Stars of David in public, and racist rhetoric about the coronavirus caused anti-Asian hate to increase by 150% in the U.S. Are we as a species becoming more radicalised on our own, or are there external factors at play? 

Inhumanely Artificially Intelligent

It has come to light that these ‘radicalisations’ are largely driven by an online phenomenon called echo chambers. The term refers to the bubbles of mutual affirmation that have become increasingly prominent in recent years. In more detail, the mechanism of echo chambers roughly works as follows: 

  1. You may be scrolling through Facebook and come across a post by a like-minded friend; you agree with the statement, so you click “like” or you comment and interact with them.

  2. The following day the social media platform (e.g. Facebook) gives you some suggestions of pages you might like containing similar posts to the one you ‘liked’, so you ‘like’ the suggested page.

  3. The platform then suggests some groups of similar content you may like, so you join them and start to interact with other members of the group.

  4. You soon find that these others are like-minded to yourself, so you start sharing your beliefs.

  5. These beliefs get affirmed by more group members and you start to feel more convinced that your beliefs are the ‘right’ ones.

  6. You start joining more like-minded groups because they are suggested to you on your own home page on the platform in question. 

  7. You have now become part of a whole network of pages and groups of people that share your convictions and affirm, or echo, each other's beliefs. 

What is missing from this sequence of events is that there is rarely, if ever, a moment where someone opposes your statements. There may be an occasional opponent to your beliefs, but this person is usually quickly shunned by the entire group, with the result that this person, and their differing beliefs, leave the group altogether. 

This particular detail is what makes echo chambers the ideal mechanism for radicalisation. By removing all possibility of opposition, one effectively ensures that all members of the group maintain strong like-minded convictions and are therefore easily persuaded to take action in some form. This could be harmless, for example a peaceful protest, even if the protestors’ motivations are scientifically unfounded. However, it could also result in riots and killings. 

What is so disconcerting about this sequence of events is that it is not steered by the group members themselves. These are people everybody knows and comes across every day, never expecting them to become so radicalised. You may have noticed that there is a silent driver behind these progressions, in the form of ‘suggested’ material personalised to the user of the platform. This suggested material is powered by each platform’s Artificial Intelligence (A.I.) algorithm, which works by playing into the user’s individual preferences based on ‘likes’ and interactions such as posts and comments. This is also known as personalisation. 
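To make that dynamic concrete, here is a deliberately simplified sketch of engagement-driven personalisation. The platforms do not disclose their actual algorithms, so every name, topic, and rule below is purely illustrative; the point is only that ranking content by past engagement naturally pushes a feed towards more of the same.

```python
# A minimal, illustrative sketch (not any platform's real code) of
# engagement-driven personalisation: content similar to what a user
# already interacted with is ranked higher, so a one-sided feed
# reinforces itself over time.
from collections import Counter

def update_profile(profile: Counter, liked_post_topics: list[str]) -> None:
    """Every 'like' strengthens the user's affinity for those topics."""
    profile.update(liked_post_topics)

def rank_suggestions(profile: Counter, candidate_posts: list[dict]) -> list[dict]:
    """Score each candidate post by how well it matches past engagement."""
    def score(post: dict) -> int:
        return sum(profile.get(topic, 0) for topic in post["topics"])
    return sorted(candidate_posts, key=score, reverse=True)

# Example: after a few likes on one theme, similar posts dominate the feed.
profile = Counter()
update_profile(profile, ["5g-conspiracy", "vaccine-scepticism"])
posts = [
    {"id": 1, "topics": ["5g-conspiracy", "vaccine-scepticism"]},
    {"id": 2, "topics": ["gardening"]},
]
print([p["id"] for p in rank_suggestions(profile, posts)])  # -> [1, 2]
```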

These algorithms are part of the reason that A.I. has been getting such a bad reputation: the platforms seldom disclose exactly how an algorithm works or why it makes the decisions that it does. The problem is that the algorithms are infused with bias without passive users being aware of it. It has therefore become clear that these personalisation practices do more harm than good and need to be thoroughly re-evaluated. 

Light in Sight

European Observatory of Online Hate

Due to these recent developments, both online and offline, the European Commission decided it was high time for counter-initiatives. For this reason, a group of civil society organisations and Textgain, a software company that specialises in text processing using A.I. algorithms, embarked on a long-term project: the European Observatory of Online Hate (EOOH). The primary goal of EOOH is to monitor online trends via the automated detection of hate speech such as racism, terrorist threats, neo-Nazism, LGBTQI+-directed toxicity, misogyny, and more. The information gathered through automated means is tracked and fed back to an expert group, allowing the development of more effective counter-narratives and policies to combat these phenomena. 

A.I. For Good

The EOOH dashboard provides practitioners with an overview of current trends, as well as a tool for amassing potential evidence, case studies, and general research on newly developing trends. Textgain’s A.I. algorithm works around the clock to fill the dashboard with relevant data that is automatically detected to contain hate speech and/or misinformation. A.I. algorithms have gained a bad image lately due to the ‘black box’ nature of their design. Countering that opacity is exactly what Textgain strives for whenever it designs and deploys A.I. algorithms. For Textgain, Explainable AI is not just a buzzword or a luxury; it is a necessity. 

To achieve this, Textgain works with hate speech lexicons, designed and composed entirely by humans, that provide the A.I. algorithm with a baseline of expressions to look out for. This enables the algorithm to detect similar expressions in the vicinity of those already in the lexicon, and so expand the lexicons of all 24 languages automatically, as the scale of the problem is too large to leave to manual monitoring alone. But in the end, it is always a person who determines whether an utterance contains harmful hate speech or not, a design principle known as human-in-the-loop.
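As a rough illustration of that workflow (a minimal sketch, not Textgain’s actual implementation; all names, expressions, and data structures are placeholders), a lexicon-based detector can flag posts containing known expressions and queue the terms that co-occur with them for a human annotator to accept or reject:

```python
# A minimal sketch of lexicon-based detection with a human in the loop.
# The lexicon is seeded entirely by human annotators; the machine only
# proposes candidate terms, and only a person decides whether they are
# added. All entries below are illustrative placeholders.
import re

LEXICON = {"example_slur", "example_threat"}   # human-curated baseline
candidate_terms: dict[str, int] = {}           # terms awaiting human review

def detect(post: str) -> set[str]:
    """Return the known lexicon expressions found in a post."""
    tokens = re.findall(r"\w+", post.lower())
    hits = LEXICON.intersection(tokens)
    if hits:
        # Words appearing alongside known expressions are possible new
        # lexicon entries -- but they stay in the review queue until an
        # annotator has looked at them.
        for token in tokens:
            if token not in LEXICON:
                candidate_terms[token] = candidate_terms.get(token, 0) + 1
    return hits

def review(term: str, is_hateful: bool) -> None:
    """Human-in-the-loop step: an annotator confirms or discards a candidate."""
    candidate_terms.pop(term, None)
    if is_hateful:
        LEXICON.add(term)
```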

Uncovering Lexical Hate with Interactive Kindness

The lexicons are designed by a group of trained individuals with diverse backgrounds, consisting of at least two annotators for each of the official European languages, so as to significantly reduce the bias of the lexicons and, by extension, of the algorithm. 

During this process, the annotators are required to fill a lexicon with at least 3,000 expressions that either constitute hate speech or often occur in a toxic context (e.g. names of politicians). These expressions are assigned to different hate speech categories, which have proven to be an effective tool for automated hate speech detection. Additionally, the annotators assign a toxicity score to each expression, indicating the level of harmfulness it conveys. 
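A lexicon entry can therefore be thought of as a small structured record. The sketch below assumes illustrative field names and an illustrative 0-to-1 toxicity scale; the actual categories and scoring conventions are defined by the annotators themselves:

```python
# An illustrative representation of a single lexicon entry; field names
# and the scoring scale are assumptions, not the project's actual schema.
from dataclasses import dataclass

@dataclass
class LexiconEntry:
    expression: str   # the word or phrase to watch for
    language: str     # one of the 24 official EU languages
    category: str     # e.g. "racism", "misogyny", "terrorist threats"
    toxicity: float   # harmfulness, e.g. 0.0 (toxic context only) to 1.0 (explicit hate)

# Two annotators per language add entries independently, which helps
# surface disagreements and reduce individual bias.
entry = LexiconEntry(
    expression="<redacted slur>",
    language="nl",
    category="racism",
    toxicity=0.9,
)
```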

During weekly Annotation Clinics, the annotators inspired each other to adopt different perspectives and investigate different trends. This led, for instance, to the discovery of a resurgence of neo-Nazism running parallel to a revival of ancient Greek mythology through conspiracy theories that have existed since the 1960s; one annotator stumbled upon old government documents containing such beliefs dating from around that time. The Annotation Clinics provided the human touch we needed to train the A.I. algorithms to the best of our collective abilities. 

Greasing the Wheels

Experts from across Europe will be given the opportunity to work on the dashboard, creating channels on the topics they are interested in. Stakeholder groups include policy makers, law enforcement, civil society, and academia, who will use our monitoring tool to leverage cutting-edge AI and gather evidence and examples that help counter hate and disinformation in their respective fields. Roundtables and in-person meetings will allow the sharing of best practices in research using AI, deepening understanding of the nature and patterns of hate in their areas of expertise. 

Reaching Out Positively

Gathering this data not only allows us to understand and observe hate online; it also gives us the ability to go one step further and use it to create positive and alternative narrative campaigns. By presenting research findings in accessible language and providing positive, alternative narratives, these campaigns build resilience to hate and disinformation among young people, aim to slow the spread of online hate in general, and offer the public some hope - it’s not all doom and gloom out there on the internet. 

Tips to combat online hate

  1. Be aware that online hate is always trying to reach a bigger stage
    Sharing or retweeting hateful content ensures that it spreads further. Try not to respond with angry emojis to hateful content, as this increases its visibility on social media. Online hatred is not representative of the wider social debate or of public opinion; hate speech is used deliberately by a select group of people to steer the debate in a certain direction.

  2. Watch out for fake news
    Is an article inflammatory and divisive? Then pay attention! Many fake news messages are designed specifically to polarise. You can find tools online to help you check facts, such as Fact Check Tools, EU Fact Check, Poynter, and many more. When you encounter fake news and wish to respond, be sure to include some empathy, facts, and humour in your message.

  3. Report inappropriate content
    Fake news and hate speech can also be reported to social media platforms. Illegal content and comments can also be reported to the police.

  4. Intervene when you see discrimination
    Talk to the perpetrator with positive norms and values. Try to "activate" these norms and values in the perpetrator by talking from a group perspective.

  5. Combat hate positively, but briefly!
    Be as brief and concise as possible when you respond to hate. Long discussions may attract other haters. Be aware that if an account mainly shares hate and inflammatory content then you are probably not able to change their opinion, but you do provide the hater with extra oxygen by arguing.


To hear more about EOOH and get involved in our positive alternative campaigns, you can sign up to our mailing list here.

