DETOXIS-IberLEF 2021 Task: DETOXIS (DEtection of TOxicity in comments In Spanish)
CALL FOR PARTICIPATION on the DETOXIS Task
The aim of the task is the detection of toxicity in comments posted in Spanish in response to different online news articles related to immigration.
The DETOXIS task is divided into two related classification subtasks:
- Subtask 1: Toxicity detection is a binary classification task that consists of classifying the content of a comment as toxic (toxic=yes) or not toxic (toxic=no).
- Subtask 2: Toxicity level detection is a more fine-grained classification task in which the aim is to identify the level of toxicity of a comment (0 = not toxic; 1 = mildly toxic; 2 = toxic; 3 = very toxic).
Although we recommend participating in both subtasks, participants are allowed to take part in just one of them (e.g., subtask 1).
Teams will be allowed (and encouraged) to submit multiple runs (max. 5).
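As an illustration of how the two label schemes described above relate, the following minimal Python sketch encodes the four toxicity levels of subtask 2 and derives a subtask 1 label from them. The names `ToxicityLevel` and `to_binary_label` are hypothetical, and the rule that any level above 0 corresponds to toxic=yes is an assumption inferred from "0 = not toxic", not something the task description states explicitly. The sketch does not reflect the official data format or evaluation.

```python
from enum import IntEnum


class ToxicityLevel(IntEnum):
    """Subtask 2 levels as listed in the task description."""
    NOT_TOXIC = 0
    MILDLY_TOXIC = 1
    TOXIC = 2
    VERY_TOXIC = 3


def to_binary_label(level: int) -> str:
    """Map a subtask 2 level to a subtask 1 label.

    Assumption (not stated by the organisers): every level above 0
    counts as toxic=yes. Values outside 0..3 raise ValueError.
    """
    return "yes" if ToxicityLevel(level) > ToxicityLevel.NOT_TOXIC else "no"


if __name__ == "__main__":
    # Print each level with the binary label derived from it.
    for level in ToxicityLevel:
        print(level.value, level.name, to_binary_label(level))
```

A system built for subtask 2 could use such a mapping to produce a subtask 1 run as well, provided the assumption holds for the released annotations.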
A comment is toxic when it attacks, threatens, insults, offends, denigrates or disqualifies a person or group of people on the basis of characteristics such as race, ethnicity, nationality, political ideology, religion, gender and sexual orientation, among others. This attack can be expressed in different ways, either explicitly (through insult, mockery and inappropriate humor) or implicitly (for instance, through sarcasm), and at different levels of intensity, that is, at different levels of toxicity (from impolite and offensive comments to the most aggressive, the latter being those comments that incite hate or even physical violence). We use toxicity as an umbrella term under which we include different definitions used in the literature to describe hate speech and abusive, aggressive, toxic or offensive language. In fact, these different terms address different aspects of toxic language.
The detection of toxicity, and especially its classification into different levels, is a difficult task because identifying a toxic comment depends not only on the linguistic content itself (what is being said and the way in which it is conveyed), but also on the contextual information (i.e., the conversational thread) and the extralinguistic context, which is related to real-world knowledge.
The presence of toxic messages on social media and the need to identify and mitigate them have led to the development of systems for their automatic detection. The automatic detection of toxic language, especially in tweets and comments, is a task that has attracted growing interest from the NLP community in recent years.
DETOXIS is the first task that focuses on the detection of different levels of toxicity in comments posted in response to news articles written in Spanish.
· Important dates:
Training dataset release: March 1, 2021
Test dataset release: April 22, 2021
Systems results: May 10, 2021
Results notification: May 17, 2021
Working papers submission: June 2, 2021
Working papers (peer-)reviewed: June 15, 2021
Camera-ready versions: July 5, 2021
· Task organizers:
- Mariona Taulé, Montserrat Nofre, Alejandro Ariza (Universitat de Barcelona, UB)
- Enrique Amigó (Universidad Nacional de Educación a Distancia, UNED)
- Paolo Rosso (Universitat Politècnica de València, UPV)
Contact the organizers by writing to: email@example.com
We invite participants to join the Google group in order to be kept up to date with the latest news related to the task: firstname.lastname@example.org