2nd Summer Datathon on Linguistic Linked Open Data (SD-LLOD-17)

The 2nd Summer Datathon on Linguistic Linked Open Data (SD-LLOD-17) will be held from June 26th to 30th 2017 at Residencia Lucas Olazábal of Universidad Politécnica de Madrid, Cercedilla, Madrid.

The main goal of the SD-LLOD datathon is to give people from industry and academia practical knowledge in the field of Linked Data applied to Linguistics. The final aim is to allow participants to migrate their own (or others’) linguistic data and publish them as Linked Data on the Web. This datathon series is unique in its topic worldwide and builds on the success of the event held two years ago (see http://datathon.lider-project.eu/). This edition is supported by the ReTeLe Spanish excellence network funded by the Spanish Ministry of Economy, Industry, and Competitiveness and by the Early Career Research Group "Linked Open Dictionaries (LiODi)" funded by the German Federal Ministry of Education and Research (BMBF).


During the datathon, participants will:

  • Generate and publish their own linguistic linked data from some existing data source.
  • Apply Linked Data principles and Semantic Web technologies (Ontologies, RDF, Linked Data) to the field of language resources.
  • Use the principal models for representing Linguistic Linked Data, in particular Ontolex-Lemon and NIF.
  • Perform Multilingual Word Sense Disambiguation and Entity Linking for the Web of Data.
  • Learn about potential benefits and applications of linguistic linked data for specific use cases.


During the datathon, seminars will be organised to cover topics such as:

  • Ontologies and Linked Data
  • The Lexicon Model for Ontologies (Ontolex-Lemon)
  • The NIF format for Integrating NLP with Linked Data and RDF
  • Guidelines for RDF generation of Language Resources
  • Methodologies for Linked Data publication of Language Resources
  • Multilingual Word Sense Disambiguation and Entity Linking with BabelNet
  • Use and Applications of Linguistic Linked Data
  • Metadata and Licenses for Linguistic Linked Data

With the objective of avoiding passive learning, the program of the summer datathon will contain three types of sessions:

      1. Seminars to show novel aspects and discuss selected topics
      2. Practical sessions to introduce the foundations, methods, and technologies of each topic, in which participants will perform different tasks using the methods and technologies presented
      3. Hacking sessions where participants will follow the whole process of generating and publishing Linguistic Linked Data with some existing data set

Participants will be invited to propose a “miniproject” related to the topic and to bring to the datathon a dataset of linguistic data produced by their organizations, in order to work on it during the hacking sessions and transform it into linked data. Participants who cannot provide their own linguistic dataset can join another participant’s miniproject or one of those proposed by the organisers. There will be an award for the best miniproject.

Participants should bring their own laptops to follow the hacking sessions. They will be provided with digital copies of all the material used during the course and will have assistance in installing all the required software.


Opening session

Asun Gómez-Pérez, Vice-Rector for Research, Innovation and Doctoral Studies and Full Professor at Universidad Politécnica de Madrid (UPM), will inaugurate the datathon with a talk titled “Linked data for language technologies”.

Invited talks

Elena González-Blanco “Poetry modelling through linked open data: a Digital Humanities approach”. Monday 26th at 10:00
Ilan Kernerman “Linked data lexicography”. Tuesday 27th at 9:30
Penny Labropoulou “Linked metadata for linked data”. Wednesday 28th at 9:30
Angus Roberts “Linking data, language, and domain knowledge”. Thursday 29th at 9:30


Julia Bosque-Gil “Linked data and dictionaries”. Tuesday 27th at 10:30
Bettina Klimek “The MMoOn model”. Wednesday 28th at 10:30
Víctor Rodríguez-Doncel “Rights and licenses for language resources”. Thursday 29th at 10:30

Practical sessions

The practical sessions will consist of a short common theoretical introduction (in a plenary room), after which the audience will be divided into two groups for the hands-on session. Each group will be led by a different lecturer, and the two will run in parallel in different classrooms.
Monday 26th
         “Introduction to Linked Data. Multilingual LD generation and publishing” Jorge Gracia and Julia

         “lemon-ontolex” John McCrae and Jorge Gracia

Tuesday 27th
         “NIF” Markus Ackermann and Bettina Klimek
         “Corpora” Christian Chiarcos, Christian Fäth and Max Ionov

Wednesday 28th
         “LLOD utilities” Andrejs Ābele and Mariano Rico


Jorge Gracia

Ontology Engineering Group, Universidad Politécnica de Madrid

I am a postdoctoral researcher at the Ontology Engineering Group (OEG), in the Artificial Intelligence Department of Universidad Politécnica de Madrid (UPM), Spain. I head the area of Data Driven Language Technologies at OEG. My main research interests are Semantic Web, Ontology Matching, Multilingual Web of Data, Query Interpretation, and Linguistic Linked Data.

For further information, please visit my personal website

John Philip McCrae

Insight Centre for Data Analytics, NUI Galway

I am a lecturer above-the-bar at the Insight Centre for Data Analytics at the National University of Ireland Galway. I am currently working with Paul Buitelaar in the Unit for Natural Language Processing. My main research has focused on the development of linguistic linked open data and, in particular, on models for the representation of lexical resources, by means of the lemon and OntoLex models.

For further information, please visit my personal website

Christian Chiarcos

ACoLi - Goethe University Frankfurt, Germany

As Professor of Computer Science at Goethe University Frankfurt, Germany, I have headed the Applied Computational Linguistics (ACoLi) lab since 2013, as well as the early career group "Linked Open Dictionaries (LiODi)". My research focuses on semantic technologies, including computational semantics as well as the innovative application of Semantic Web standards to problems in NLP and Digital Humanities.

For further information, please visit my personal website

Local Organisers

Elena Montiel Ponsoda

Ontology Engineering Group, Universidad Politécnica de Madrid

Administrative Director
For further information, please visit my personal website

José Ángel Ramos Gargantilla

Ontology Engineering Group, Universidad Politécnica de Madrid

Datathon Secretary
For further information, please visit my personal website

How to Apply

We welcome participants from anywhere in the world, from both industry and academia. Some basic acquaintance with software development and Web technologies is required. Participants are expected to take part fully in the activities of the datathon until its conclusion.

Registration is now closed. The registration fee is 490€ and includes accommodation (individual room at the venue), meals, and social events.

If you want to propose a topic for a mini-project in the datathon (e.g., a language resource to be converted into linked data, a LLOD dataset to be linked to other resources, a use case description that exploits the LLOD cloud, ...) or want to report on some recent research related to the topics of the datathon, you can write a short description of your ideas (less than 1000 words). You can either submit it during the registration process or send it to the organisers via email by 4th April. Some selected mini-project proposals and abstracts will be presented as posters during the event.

Important Dates

Registration opens: February 1st, 2017

Registration closes: April 4th, 2017 (originally March 24th, 2017)

Notification: April 12th, 2017 (originally March 31st, 2017)

Payment until: June 16th, 2017 (originally April 21st, 2017)

Datathon: June 26th to 30th, 2017

Here you can find a brief report summarising how the previous edition of the datathon (2015) went.

Here are some pictures of SD-LLOD-15 from Twitter.


The Summer Datathon on Linguistic Linked Open Data (SD-LLOD-17) will be held in Cercedilla (Madrid), a small village in the mountains near Madrid.

The event will take place at the Residencia Lucas Olazábal of Universidad Politécnica de Madrid, located in the forest of the Sierra de Guadarrama at a place known as Las Dehesas de Cercedilla, 50 km from Madrid and 15 km from Navacerrada.

Going to Cercedilla from Madrid-Barajas airport

The nearest airport is Adolfo Suárez Madrid-Barajas. Once you are in Madrid, the best way to reach Cercedilla is by train. The whole trip, from Adolfo Suárez Madrid-Barajas airport to the Residence in Cercedilla, takes around 2 hours.

Taking the train to Cercedilla

If you arrive at Terminals 1, 2 or 3 of Adolfo Suárez Madrid-Barajas Airport
Go to the metro station and take line 8 (the pink line on the map) to Nuevos Ministerios station. Leave the metro station and go to the train station (Cercanías Renfe). Then take line C8B to Cercedilla.

If you arrive at Terminal 4 of Adolfo Suárez Madrid-Barajas Airport
Go to the train station and take line C1 to Chamartín. Then take line C8B to Cercedilla.

Getting to the Residence From Cercedilla

The train station in Cercedilla is 4 km away from the Residence, so you should take a taxi to get there. The taxi stop is in front of the train station.
If you arrive late and there is nobody waiting for you at the Cercedilla train station, please phone the following number:
(+34) 91 852 15 68

Going by car to the Residence

From Madrid, take motorway A-6 to the exit for El Escorial, Guadarrama. Head towards Guadarrama until you reach the crossroads with the old N-6 (a crossroads with traffic lights where you can see El Piquio Hotel). Turn left and drive through Guadarrama village until you see the sign for Cercedilla (a road on the right, next to a headquarters). Continue straight along this road to Cercedilla. After passing Cercedilla’s train station, go straight on at the next crossroads in the direction of Las Dehesas – La Fuenfría. When you reach the forest information point, turn right, following the signs for Residencia Lucas Olazábal UPM.

Residencia Lucas Olazábal