PROfiling LINGuistic KNOWledgE gRaphs (ProLingKNOWER)
Meeting Dates: 23rd May 2022
Venue: Jerusalem, Israel (& online)
Organizer Institution(s):
Local Organizer(s): Chaya Liebeskind, Jerusalem College of Technology, Jerusalem
Organizing committee: Blerina Spahiu – Università degli Studi di Milano – Bicocca, Italy; Milan Dojchinovski – Czech Technical University in Prague / DBpedia Association, Germany; Penny Labropoulou – Institute for Language and Speech Processing/R.C. “Athena”, Greece; Vojtěch Svátek – Prague University of Economics and Business, Czech Republic
Programme (Jerusalem local time, GMT+3), 23rd May 2022:
9:00 – 9:30 Welcome
9:30 – 10:00 Profiling Linguistic Knowledge Graphs – Blerina Spahiu, Renzo Alva Principe and Andrea Maurino
10:00 – 10:20 Effect of heuristic post-processing on knowledge graph profile patterns: cross-domain study – Gollam Rabby, Farhana Keya, Vojtěch Svátek and Renzo Alva Principe
10:20 – 10:40 EDIE – Elexis: DIctionary Evaluation Tool – Seung-Bin Yim, Lenka Bajčetić, Thierry Declerck and John P. McCrae
10:45 – 11:15 Coffee break
11:15 – 12:45 Discussion
12:45 – 13:45 Lunch break
Short description:
The focus of this workshop is to reveal novel approaches, methodologies and frameworks for profiling Linguistic Linked Data (LLD) (corpora, lexicons, ontologies, etc.) as well as to highlight tools and user interfaces that can effectively assist different use cases for profiling such data. In addition, the workshop seeks methodologies that support effective profiling when building real-world Linked Data applications that leverage linguistic data, as well as use cases that reveal success stories or aspects that have been neglected so far. Addressing Linguistic Linked Data profiling issues will not only help in understanding and exploring such data, but also provide the means to increase Linguistic Linked Data consumption and to keep track of the evolution of the relevant datasets.
Despite the high number of datasets published as LLD, they remain underused because they lack comprehensive metadata. Data consumers need concise information about a dataset to decide whether it is useful for their use case. Data profiling techniques offer an efficient solution to this problem: they generate a semantic profile that contains metadata and statistics describing the content of the dataset. Semantic profiles are important for many use cases, such as: (1) provision of a general overview of the data, (2) ontology / dataset integration, (3) identification of quality issues, (4) query optimization, (5) data visualization, (6) data analytics tasks, (7) schema discovery, and (8) entity summarization.
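As a rough illustration of what such a semantic profile might contain, the following minimal Python sketch counts triples, distinct subjects, class instances and predicate usage over a tiny in-memory dataset. The function name `profile`, the sample triples and the `ex:` identifiers are hypothetical and chosen only for illustration; the class and property names are borrowed from the OntoLex-Lemon vocabulary. Real profiling tools work on full RDF graphs and produce much richer statistics.

```python
from collections import Counter

RDF_TYPE = "rdf:type"


def profile(triples):
    """Compute simple summary statistics for (subject, predicate, object) triples."""
    triples = list(triples)
    # Number of instances per class, taken from rdf:type statements.
    class_counts = Counter(o for _, p, o in triples if p == RDF_TYPE)
    # Usage count of every other predicate.
    predicate_counts = Counter(p for _, p, _ in triples if p != RDF_TYPE)
    return {
        "triples": len(triples),
        "distinct_subjects": len({s for s, _, _ in triples}),
        "classes": dict(class_counts),
        "predicates": dict(predicate_counts),
    }


# Hypothetical toy lexicon; the ex: identifiers are made up for this sketch.
SAMPLE = [
    ("ex:cat", "rdf:type", "ontolex:LexicalEntry"),
    ("ex:cat", "ontolex:canonicalForm", "ex:cat_form"),
    ("ex:cat_form", "rdf:type", "ontolex:Form"),
    ("ex:cat_form", "ontolex:writtenRep", '"cat"@en'),
    ("ex:dog", "rdf:type", "ontolex:LexicalEntry"),
]

summary = profile(SAMPLE)
print(summary)
```

Even a profile this small tells a data consumer which classes and properties a dataset uses and how heavily, which is the kind of concise overview described above.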
The proposed workshop seeks application-oriented papers, as well as more theoretical papers and position papers. The workshop proposes a multidisciplinary discussion on the following themes, with a focus on RDF data. Topics include, but are not limited to:
- Semantic profile representation of linguistic data
- Evaluation of linguistic dataset profiling tools and algorithms
- Linguistic data summarisation
- Ontology and data quality evaluation for linguistic data
- Fusing and refining linguistic profiling results
- Scalable approaches for linguistic profile generation
- SHACL shapes as a means for profiling
- Topic profiling for linguistic data
Besides academia, the workshop targets developers and other knowledge workers. We envision the workshop as a forum for researchers and practitioners to come together and discuss common challenges and identify synergies for joint initiatives. We welcome contributions describing technical approaches, as well as those related to real use cases in using semantic profiles.
To ensure the high quality of accepted papers, the workshop uses a peer review process. Each submission will be reviewed by at least two members of the Program Committee (PC). Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop.
The accepted papers will be published in online proceedings (most likely in the CEUR-WS series).
Organising Committee:
- Blerina Spahiu – Università degli Studi di Milano – Bicocca, Italy
- Milan Dojchinovski – Czech Technical University in Prague / DBpedia Association, Germany
- Penny Labropoulou – Institute for Language and Speech Processing/R.C. “Athena”, Greece
- Vojtěch Svátek – Prague University of Economics and Business, Czech Republic
Program Committee:
- Albin Ahmeti (Semantic Web Company, Austria)
- Riccardo Albertoni (Institute for Applied Mathematics and Information Technologies, Consiglio Nazionale delle Ricerche, Genoa, Italy)
- Luigi Asprino (University of Bologna, Italy)
- Gabriella Casalino (University of Bari A.Moro, Italy)
- Marco Cremaschi (Università degli Studi di Milano – Bicocca, Italy)
- Theodore Dalamagas (Athena RC/Information Management Systems Institute, Greece)
- Jeremy Debattista (Top Quadrant, Malta)
- Thierry Declerck (DFKI GmbH – Saarbrücken, Germany)
- Jose Emilio Labra Gayo (University of Oviedo, Spain)
- Alfonso Guarino (Università degli Studi di Foggia, Italy)
- Jakub Klímek (Charles University, Prague, Czech Republic)
- Włodzimierz Lewoniewski (Poznań University of Economics and Business, Poland)
- Andrea Maurino (Università degli Studi di Milano – Bicocca, Italy)
- Anisa Rula (University of Brescia, Italy)
- Daniele Schicchi (Università di Palermo, Italy)
- Dimitrios Skoutas (Athena RC/Information Management Systems Institute, Greece)
- Daniele Spoladore (Istituto di Sistemi e Tecnologie Industriali Intelligenti per il Manifatturiero Avanzato (STIIMA) – CNR, Italy)
- Manuel Vimercati (Università degli Studi di Milano – Bicocca, Italy)
- Beyza Yaman (ADAPT Centre, Dublin City University, Ireland)
Important Dates:
- Paper submission: 04 April 2022 (extended)
- Workshop paper notifications sent: 21 April 2022
- Camera-ready copies due: 13 May 2022
Registration form:
Venue: hybrid (virtual and in-person) workshop co-located with “NexusLinguarum Workshop days” in Jerusalem, Israel
More details on the Workshop days:
Date: 23 May
We welcome the following types of contributions:
- Short (up to 5 pages) and full (up to 10 pages) research papers
- Industry and use case presentations (up to 5 pages)
- Tool and system demonstrations (up to 4 pages)
- Position papers (up to 4 pages)
Submissions must be written in English, submitted as PDF, and formatted in the Springer Lecture Notes in Computer Science (LNCS) style. Papers must be submitted through EasyChair.
In addition, the workshop organization team is planning to invite selected workshop submissions to a dedicated journal special issue.