Blog Post on the Short Term Scientific Mission at the Dutch Language Institute

Linking lexicographic resources and CEFR-graded vocabulary lists

Dates: April 3, 2022 to April 12, 2022

Duration: 10 days

Applicant: Ilan Kernerman

Venue: Leiden, The Netherlands

Host Institution: Dutch Language Institute

Host: Carole Tiberius

Involved WGs: WG4

DESCRIPTION

The objective of the STSM was to explore linking lexicographic data to difficulty-graded vocabularies, with the aim of enhancing the development of language teaching and training materials. Our main goals and outcomes can be summarized as follows:

To study the best methods for linking lexicographic information to the difficulty of words and phrases for different learner levels – we decided to focus on the CEFR grading system and identified the major challenges.
To explore how to integrate such information with language learning applications – we ran initial tests on linking lexicographic resources with a CEFR list, in preparation for incorporating the merged information into language learning applications.
To plan the upload of by-products to the Linguistic Linked Open Data (LLOD) cloud – we began studying the legal implications of offering privately owned data on the LLOD cloud, in order to find the most appropriate form of license.
To contribute to the NexusLinguarum COST Action – by means of a comprehensive assessment of the feasibility and usefulness of the linking described above, including multilingual applications and eventual upload to the LLOD cloud.
To plan a joint paper for presentation – at the Nexus-supported conference ‘LLOD approaches for language data research and management’, due to take place in Vilnius on September 21-22, and for subsequent publication in the journal Rasprave.
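
As a rough illustration of what linking a lexicographic resource to a CEFR list can look like in practice, here is a minimal Python sketch. It is not the procedure used during the STSM: the `LexEntry` structure, the `link_cefr` function, the sample entries and the levels assigned to them are all hypothetical, made up for this example. It assumes both resources can be matched on lemma plus part of speech, which is itself one of the open challenges, since homonyms and multiword expressions need finer-grained matching.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LexEntry:
    """Hypothetical, simplified dictionary entry."""
    lemma: str
    pos: str
    definition: str
    cefr: Optional[str] = None  # filled in by link_cefr when a match is found


def link_cefr(entries, cefr_list):
    """Attach a CEFR level to every entry whose (lemma, pos) is in cefr_list.

    cefr_list is an iterable of (lemma, pos, level) tuples.
    Returns (linked, unmatched) so that gaps in coverage stay visible.
    """
    # Index the graded list by lowercased lemma and part of speech.
    index = {(lemma.lower(), pos): level for lemma, pos, level in cefr_list}
    linked, unmatched = [], []
    for entry in entries:
        level = index.get((entry.lemma.lower(), entry.pos))
        if level is None:
            unmatched.append(entry)
        else:
            linked.append(LexEntry(entry.lemma, entry.pos, entry.definition, level))
    return linked, unmatched


# Invented sample data; the levels are placeholders, not taken from a real list.
entries = [
    LexEntry("house", "noun", "a building for people to live in"),
    LexEntry("walk", "verb", "to move on foot"),
    LexEntry("nevertheless", "adverb", "in spite of that"),
]
cefr_list = [("house", "noun", "A1"), ("walk", "verb", "A1")]

linked, unmatched = link_cefr(entries, cefr_list)
for entry in linked:
    print(entry.lemma, entry.pos, entry.cefr)
print("unmatched:", [entry.lemma for entry in unmatched])
```

Returning the unmatched entries separately makes it easy to see how much of a dictionary a given graded list actually covers, which matters when judging the feasibility of the linking.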

We will continue to explore options arising from this STSM for launching a wide-ranging project that brings together academic and industry partners from the lexicographic, language teaching, linked data and knowledge communities, within the framework of the Horizon Europe or Digital Europe programs starting in 2023.