CALL FOR PARTICIPATION
IberLEF 2022 Task: DETESTS (DETEction and classification of racial Stereotypes in Spanish)
This task is part of IberLEF 2022, the 4th Workshop on Iberian Languages Evaluation Forum at the SEPLN 2022 Conference, which will be held in A Coruña, Spain, on September 20th.
The aim of the task is to detect and classify stereotypes in sentences from comments posted in Spanish in response to different online news articles related to immigration. The task is designed hierarchically by chaining two subtasks, so participants can either model the simple binary scenario (a stereotype is present or not) or complete the entire pipeline by modeling the more complex multi-label classification problem (different types of stereotypes). The two subtasks are described below:
- Subtask 1: Participants must determine whether a sentence contains at least one stereotype (positive example) or none (negative example), taking into account the full distribution of labels provided by the annotators, following the learning with disagreements proposal. The gold label of this subtask serves as a proxy to determine the subset of sentences that will be evaluated in the subsequent subtask.
- Subtask 2: This subtask consists of determining whether a sentence contains at least one stereotype or none, and assigning the sentences marked as positive (with stereotypes) to one or more of ten categories that present immigrants as: 1) ‘victims of xenophobia’, 2) ‘suffering victims’, 3) ‘economic resources’, 4) a problem of ‘migration control’, 5) people with ‘cultural and religious differences’, 6) people who take ‘benefits’ from our social policy, 7) a problem for ‘public health’, 8) a threat to ‘security’, 9) ‘dehumanization’ and 10) ‘other’ types of stereotypes. Since a sentence can contain multiple stereotypes belonging to different categories, this subtask is presented as a multi-label hierarchical classification problem.
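As an informal illustration of the two subtasks, the sketch below shows one possible way to represent the labels in Python. It is not the official data or submission format: the category identifiers, the helper names soft_label and encode, and the 0/1 vote encoding are all assumptions made for this example. The first helper turns per-annotator votes into a soft label, in the spirit of learning with disagreements; the second builds a multi-label vector in which category bits can only be set when the sentence is marked as containing a stereotype.

```python
from typing import List

# Category identifiers paraphrased from the task description.
# The exact names used in the released data are an assumption.
CATEGORIES = [
    "xenophobia_victims",
    "suffering_victims",
    "economic_resources",
    "migration_control",
    "cultural_religious_differences",
    "benefits",
    "public_health",
    "security",
    "dehumanization",
    "other",
]


def soft_label(votes: List[int]) -> float:
    """Fraction of annotators who marked the sentence as stereotyped.

    `votes` holds one 0/1 judgement per annotator (hypothetical encoding).
    """
    if not votes:
        raise ValueError("at least one annotator vote is required")
    return sum(votes) / len(votes)


def encode(has_stereotype: bool, categories: List[str]) -> List[int]:
    """Return [stereotype flag] followed by one indicator per category.

    Hierarchical constraint: when the flag is negative, every category
    indicator is forced to zero.
    """
    unknown = set(categories) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    flag = int(has_stereotype)
    return [flag] + [flag * int(name in categories) for name in CATEGORIES]


if __name__ == "__main__":
    # Two of three hypothetical annotators saw a stereotype.
    print(soft_label([1, 1, 0]))
    # A positive sentence carrying two categories at once.
    print(encode(True, ["benefits", "security"]))
    # A negative sentence: category bits are zeroed out.
    print(encode(False, ["security"]))
```

Keeping the binary flag as the first position of the vector mirrors the chaining of the two subtasks: a system that only addresses Subtask 1 can predict that position alone.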
Although we recommend participating in both subtasks, participants are allowed to take part in just one of them (e.g., subtask 1).
Teams will be allowed (and encouraged) to submit multiple runs (max. 5).
This task is aimed at participants interested in the detection and classification of racial, national, or ethnic stereotypes, a relevant and relatively novel area of research due to its impact on modern society. Furthermore, the annotated dataset is a valuable resource for exploratory linguistic analysis, as well as for comparing deep learning and classical machine learning models on Spanish stereotyped expressions under the recently introduced learning with disagreements paradigm. Participants will be provided with the annotations of each individual annotator as well as the gold standard.
Linguistic resources:
Our DETESTS corpus is made up of two parts: a subset of the NewsCom-TOX corpus and the StereoCom corpus. Both corpora consist of comments (at least 50), segmented into sentences, that were published in response to manually selected articles extracted from Spanish online newspapers. The common topic of all articles is immigration.
The DETESTS corpus consists of 5,629 sentences. We will provide participants with 70% of the dataset to train their models, while the remaining 30% will be used to test their models.
To avoid any conflict with the sources of the comments regarding their intellectual property rights (IPR), the data will be sent privately to each participant who is interested in the task. The corpus will only be made available for research purposes.
Important dates (All deadlines are 11:59 PM UTC-12:00):
Training dataset release: March 21, 2022 (already available on the website)
Test dataset release: April 20, 2022
Systems results: May 16, 2022
Results notification: May 23, 2022
Working papers submission: June 9, 2022
Working papers (peer-)reviewed: June 20, 2022
Camera-ready versions: July 4, 2022
Workshop at IberLEF 2022: September 20, 2022
Task organizers:
- Mariona Taulé (Universitat de Barcelona, UB)
- Wolfgang Schmeisser (Universitat de Barcelona, UB)
- Alejandro Ariza (Universitat de Barcelona, UB)
- Montserrat Nofre (Universitat de Barcelona, UB)
- Enrique Amigó (Universidad Nacional de Educación a Distancia, UNED)
- Paolo Rosso (Universitat Politècnica de València, UPV)
- Berta Chulvi (Universitat Politècnica de València, UPV)
Contact:
Contact the organizers by writing to detests.iberlef@gmail.com.
We invite participants to join our Google Group to stay up to date with the latest news related to the task.
For more information, please visit our website detestsiberlef.wixsite.com/detests.
Sponsors:
European project 'STERHEOTYPES-Studying European Racial Hoaxes and Sterheotypes' funded by 'Challenge for Europe' call for Project, Compagnia San Paolo (CUP: B99C20000640007).
Grant XAI-DisInfodemics: IA explicable para desinformación y detección de conspiración durante infodemias (PLEC2021-007681) funded by MCIN/AEI/10.13039/501100011033 and, as appropriate, by the “European Union NextGenerationEU/PRTR”.