Blog Post on “The Working Group 2 journey so far”

Working Group 2 (WG2) of NexusLinguarum is in charge of using Linguistic Linked Open Data (LLOD) to improve different natural language processing tasks, such as knowledge extraction, machine translation and question answering. The internal structure of WG2 is organised according to the tasks to which LLOD is being applied:

T2.1 focuses on LLOD and Knowledge Extraction
T2.2 focuses on LLOD and Machine Translation
T2.3 focuses on LLOD and Question Answering
T2.4 focuses on LLOD and Word Sense Disambiguation
T2.5 focuses on LLOD and Terminology

The coordination of the different tasks is led by institutions from different countries in Europe: Universidad Politécnica de Madrid (Spain), NUI Galway (Ireland), University of Coimbra and Universidade Nova de Lisboa (Portugal), Aalto University (Finland), Sapienza University of Rome (Italy) and Uni Bielefeld (Germany).

The multicultural nature of this working group opens the door to collaboration amongst these institutions, in the form of joint research publications and even research stays (Short-Term Scientific Missions, STSMs). So far, WG2 has carried out one three-month STSM between UPM (visitor) and DFKI (host), which also served as a collaboration between two working groups (WG1 and WG2).

The STSM took place in Saarbrücken, a beautiful city in the west of Germany. Due to the pandemic, the “social conditions” were not the best, but the stay was very productive in terms of research outcomes.

DFKI Saarbrücken (Image from Wikimedia)

Although only one STSM has been carried out, the collaboration amongst working groups does not end there. Once the STSM was finished, WG1 and WG2 continued discussing the terminology modelling proposal, and members of both groups are now working on a paper to present the results. Similarly, WG2 and WG4 maintain a twofold collaboration that has resulted in two publications: the first focuses on the application of NLP and the Semantic Web in a humanities use case, and the second is a survey of LLOD and NLP methods to detect and represent semantic change.

The collaboration amongst WG2, WG3 and WG4 has also materialised in two WG4 STSMs on the topics of discourse markers, machine learning, multilingual corpora and metaphors in Covid-19 related corpora.

In addition, several members of the working group have been in charge of organising conferences, workshops and similar events. During this time, three events have been organised, including:

The Bielefeld Hackathon, as part of task 2.3, on multilingual question answering, co-organised by the University of Naples L’Orientale, Italy.
The Lisbon Training School, as part of task 2.5, on terminology, lexicography and Linked Data, with the help of the WG1 and WG3 co-leaders.

As for future directions, WG2 is already working on several papers within the different tasks and on a general paper combining some of them. This paper is intended as a state-of-the-art survey on “LLOD in Terminology and Knowledge Management”. In addition, T2.1 is planning a series of Pizza Lunch Seminars as a co-working activity; T2.2 is organising a shared task on machine translation for Dravidian languages; T2.4 has proposed a reading group to foster collaboration with other WGs; and several members of the group are organising the 4th Summer Datathon on LLOD, which will take place in Cercedilla (Spain) in June 2022.

Residencia Lucas Olazábal of Universidad Politécnica de Madrid, in Cercedilla (Image from https://residencialucasolazabal.es/?lang=en)