Abusive language dataset annotation workshop
Meeting Date: 28 September 2021
Venue: Skopje, North Macedonia (& online)
Organizing committee: Barbara Lewandowska-Tomaszczyk, Slavko Žitnik, Anna Bączkowska, Giedre Valunaite Oleskevicienė, Chaya Liebeskind, Jelena Mitrovič, Kristina Despot
Image above by Julian Nyča – no changes. Licence: CC BY-SA 3.0. https://de.wikipedia.org/wiki/Skopje#/media/Datei:Skopje_%E2%80%93_Vardar_River.jpg
NexusLinguarum organized a workshop on abusive language dataset annotation on 28 September 2021. It took place as a hybrid event in Skopje, North Macedonia, and online.
|9:00 – 11:00||Barbara Lewandowska-Tomaszczyk: Presentation of insights and needs for a new annotation of offensive language corpora (new taxonomy). Annotation guidelines with annotated examples; each participant will receive printed guidelines (prepared by Barbara Lewandowska-Tomaszczyk, Slavko Žitnik, Anna Bączkowska, Giedre Valunaite Oleskevicienė, Chaya Liebeskind, Jelena Mitrovič). Searching for translations of the offensive language categories from English into the participants' native languages (Google Sheet).|
|11:00 – 11:30||Coffee break|
|11:30 – 13:30||Slavko Žitnik: Overview of offensive language corpora and preparation of data for the annotation campaign. General introduction to annotation campaigns (annotation tools, process, annotators, curators, monitoring, inter-rater agreement). Hands-on tutorial on using the INCEpTION annotation tool for offensive language annotation; each participant will receive a username and password.|
|13:30 – 14:30||Lunch break|
|15:00 – 17:00||Barbara Lewandowska-Tomaszczyk, Anna Bączkowska (online), Giedre Valunaite Oleskevicienė, Slavko Žitnik: Live annotation of the prepared corpora using the INCEpTION tool. Guidance and support during annotation and discussion of different annotation options, which may result in an update to the annotation guidelines.|
|17:00 – 17:30||Coffee break|
|17:30 – 18:30||Slavko Žitnik: Review of annotation activities during the workshop. Gathering interest and plans from interested participants, including how much time they are prepared to invest in annotation after the workshop. Decision on deadlines and means of collaborative communication (mailing list). Open discussion and closing.|