The Linguistic Computing Laboratory (LCL) is part of the Computer Science Department of the Sapienza University of Rome. The group conducts state-of-the-art research in the area of Natural Language Processing.

The group devises and develops algorithms and methods in machine learning, pattern matching and recognition, and natural language processing. These target problems such as automatic text understanding; the construction, learning and population of ontologies; semantic text indexing and classification; query expansion; and question answering.

Research fields include:
  • Multilingual Word Sense Disambiguation and Induction
  • Multilingual Entity Linking
  • Broad and Deep Learning
  • Distributional Semantic Similarity
  • Ontology Learning and Population
  • Large-Scale Knowledge Acquisition
  • Semantic and Statistical Machine Translation
  • Semantic Information Retrieval
  • Social Network Analysis and Mining
ACL Tutorial 2016: Semantic Representations of Word Senses and Concepts

The LCL members José Camacho Collados, Ignacio Iacobacci, Roberto Navigli, and Mohammad Taher Pilehvar (currently at the University of Cambridge) will be presenting a tutorial on “Semantic Representations of Word Senses and Concepts” in Berlin at the ACL conference (August 7th, 2016).
José Camacho Collados received a prestigious Google PhD Fellowship!

We are proud to announce that José Camacho Collados has been awarded the 2016 Google Fellowship in Natural Language Processing!
BabelNet 3.6 is now out!

As the final output of the "MultiJEDI" Starting Grant (http://multijedi.org), funded by the European Research Council and headed by Prof. Roberto Navigli, the Linguistic Computing Laboratory of the Sapienza University of Rome is proud to announce the release of BabelNet 3.6.

BabelNet (http://babelnet.org) is the largest multilingual encyclopedic dictionary and semantic network created by means of the seamless integration of the largest multilingual Web encyclopedia - i.e., Wikipedia - with the most popular computational lexicon of English - i.e., WordNet, and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, Open Multilingual WordNet, Wikiquote, VerbNet, Microsoft Terminology, GeoNames, WoNeF, ImageNet, ItalWordNet and Open Dutch WordNet. The integration is performed via an automatic linking algorithm and by filling in lexical gaps with the aid of Machine Translation. The result is an encyclopedic dictionary that provides Babel synsets, i.e., concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations.

Version 3.6 comes with the following features:
  • New resources integrated: ItalWordNet, Open Dutch WordNet.
  • 625 million new senses (for a total of 745 million Babel senses, considerably increasing language coverage).
  • 6.4 million surface forms for Babel synsets.
  • 3.5 million YAGO external links.
  • Improved version of the Java and HTTP RESTful API (http://babelnet.org/download)
  • For fans of offline processing for non-commercial purposes: downloadable offline indices will be available shortly!

More statistics are available at: http://babelnet.org/stats

Enjoy!
The Luxembourg BabelNet Workshop

2-3 March, 2016, Luxembourg
http://babelnet.org/lux

Schuman Building of the European Parliament, Hemicycle. 2929 Luxembourg

Organized by:
EU Publications Office, European Commission, European Parliament

We are proud to announce the Luxembourg BabelNet workshop. This event is a technical workshop on BabelNet, the largest multilingual encyclopedic dictionary and semantic network -- now also a huge knowledge base -- covering 14 million concepts and named entities in 272 languages. The workshop will take place over two days. The first day is a technical guided tour, including industrial applications. The second day consists of four case studies of resource mapping to BabelNet.

The workshop is open to all comers. It will be held in English and attendance is free, up to the room capacity.

This is a technical IT workshop; the ideal participant has a background in computer science and natural language processing. Participants from other backgrounds with an interest in the IT aspects of their specialty (such as authors, translators and publishers) should also benefit, provided they are aware of the workshop's technical nature. Regardless of background, participants will gain a deep understanding of BabelNet. At a minimum, they will be able to use even its most advanced functionalities properly, such as traversing the network, multilingual disambiguation and high-performance mapping; some may be able to contribute at the conceptual level; and, at the higher end, some may contribute to and get involved in its development.
BabelNet 3.5 is now out!

As an output of the "MultiJEDI" Starting Grant, funded by the European Research Council and headed by Prof. Roberto Navigli, the Linguistic Computing Laboratory (http://lcl.uniroma1.it) of the Sapienza University of Rome is proud to announce the release of BabelNet 3.5.

BabelNet (http://babelnet.org) is a very large multilingual encyclopedic dictionary and semantic network created by means of the seamless integration of the largest multilingual Web encyclopedia - i.e., Wikipedia - with the most popular computational lexicon of English - i.e., WordNet, and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, Open Multilingual WordNet, Wikiquote, VerbNet, Microsoft Terminology, GeoNames, WoNeF, and ImageNet. The integration is performed via an automatic linking algorithm and by filling in lexical gaps with the aid of Machine Translation. The result is an encyclopedic dictionary that provides Babel synsets, i.e., concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations.

Version 3.5 comes with the following features:
More statistics are available at: http://babelnet.org/stats

Enjoy!
BabelNet received the prestigious META prize 2015!

BabelNet - for groundbreaking work in overcoming language barriers through a multilingual lexicalised semantic network and ontology making use of heterogeneous data sources. The resulting encyclopedic dictionary provides concepts and named entities lexicalised in many languages, enriched with semantic relations.
Babelfy 1.0 is now out!

Babelfy is a joint, unified approach to Word Sense Disambiguation and Entity Linking in the language of your choice. The system is based on a loose identification of candidate meanings coupled with a densest subgraph heuristic which selects high-coherence semantic interpretations. Its performance on standard word sense disambiguation and entity linking tasks is on a par with, or surpasses, that of language- and task-specific state-of-the-art systems.
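
As a rough illustration of the densest subgraph idea, the sketch below applies the textbook greedy peeling approximation to a toy graph of candidate meanings. It is not Babelfy's actual procedure: the function name, the node labels and the edges are all invented for this example, and a real system would build the graph from BabelNet.

```python
from collections import defaultdict


def densest_subgraph(edges):
    """Greedy peeling approximation (hypothetical helper, not Babelfy code).

    Repeatedly removes the vertex with the fewest neighbours and remembers
    the vertex set with the highest edges-per-vertex density seen so far.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes = set(nodes)
    best_density = n_edges / len(nodes) if nodes else 0.0

    while len(nodes) > 1:
        # Pick the least connected vertex; ties are broken by name so the
        # result is deterministic.
        victim = min(nodes, key=lambda n: (len(adj[n]), n))
        n_edges -= len(adj[victim])
        for nbr in adj[victim]:
            adj[nbr].discard(victim)
        del adj[victim]
        nodes.remove(victim)

        density = n_edges / len(nodes)
        if density > best_density:
            best_density, best_nodes = density, set(nodes)

    return best_nodes, best_density


# Toy candidate meanings for "bank", "loan" and "interest" (made-up labels).
toy_edges = [
    ("bank#finance", "loan#finance"),
    ("bank#finance", "interest#finance"),
    ("loan#finance", "interest#finance"),
    ("bank#river", "interest#hobby"),
]
coherent, density = densest_subgraph(toy_edges)
print(sorted(coherent), density)
```

On this toy graph the three mutually connected finance senses survive the peeling, while the weakly connected river and hobby senses are dropped. This is the intuition behind preferring high-coherence interpretations.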

New features in Babelfy v1.0:
  • 271 languages covered plus a novel language-agnostic setting!

  • Available via easy-to-use Java and HTTP RESTful APIs.

  • The input context can be either a text or a bag of words where you can mix up languages!

  • Plenty of tunable parameters for the disambiguation procedure: set your own threshold, enable multiple scored annotations of the same fragment, restrict the annotations to WordNet, Wikipedia or BabelNet, specify the offsets that you want linked, provide pre-annotated tokens as disambiguation context, and enable or disable the most common sense heuristic, multi-word expressions and the densest subgraph heuristic.

  • Three different scores are now output: the disambiguation score, a coherence score and a global relevance score.

  • Disambiguation and entity linking are performed using BabelNet, thereby implicitly annotating according to several different inventories such as WordNet, Wikipedia, Wiktionary, OmegaWiki, etc.
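
To make the threshold and multiple-annotation options above more concrete, here is a minimal client-side sketch of how scored annotations might be filtered. The Annotation record, its field names, the select_annotations helper and the "toy:" identifiers are assumptions made for this example; they are not the Babelfy API or its response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record carrying the three scores."""
    start: int              # character offset where the fragment begins
    end: int                # character offset where the fragment ends
    synset_id: str          # placeholder identifier, not a real Babel synset ID
    score: float            # disambiguation score
    coherence_score: float
    global_score: float


def select_annotations(annotations, threshold=0.0, keep_all=False):
    """Drop annotations below the threshold.

    With keep_all=False only the top-scoring annotation per fragment is
    returned; with keep_all=True every surviving annotation is kept,
    ordered by fragment and then by descending score.
    """
    kept = [a for a in annotations if a.score >= threshold]
    if keep_all:
        return sorted(kept, key=lambda a: (a.start, a.end, -a.score))

    best = {}
    for a in kept:
        key = (a.start, a.end)
        if key not in best or a.score > best[key].score:
            best[key] = a
    return [best[key] for key in sorted(best)]


# Made-up annotations for two text fragments.
sample = [
    Annotation(0, 4, "toy:bank-finance", 0.9, 0.6, 0.20),
    Annotation(0, 4, "toy:bank-river", 0.3, 0.1, 0.05),
    Annotation(10, 14, "toy:loan", 0.1, 0.1, 0.01),
]
for ann in select_annotations(sample, threshold=0.2):
    print(ann.synset_id, ann.score)
```

Raising the threshold trades recall for precision, while keep_all=True mirrors the option of returning several scored annotations for the same fragment.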

BabelNet 3.0 is now out!

BabelNet (http://babelnet.org) is a very large multilingual encyclopedic dictionary and semantic network created by means of the seamless integration of the largest multilingual Web encyclopedia - i.e., Wikipedia - with the most popular computational lexicon of English - i.e., WordNet, and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, and the Open Multilingual WordNet. The integration is performed via an automatic linking algorithm and by filling in lexical gaps with the aid of Machine Translation. The result is an encyclopedic dictionary that provides Babel synsets, i.e., concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations.

Version 3.0 comes with the following features:
  • 271 languages now covered!

  • New Java and HTTP RESTful API

  • Fully taxonomized thanks to the seamless integration of our Wikipedia Bitaxonomy

  • 13.7 million meanings (Babel synsets)

  • 40.3 million textual definitions

Tutorials in 2014

We are presenting tutorials at four different conferences:
Babelfy released!

We are happy to announce the release of Babelfy which is a unified approach to multilingual Word Sense Disambiguation and Entity Linking.
Babelfy website
ERC Starting Grant!

Prof. Roberto Navigli has been awarded a prestigious ERC starting grant in computer science and informatics (2011-2016). The project, called MultiJEDI, will focus on multilingual semantic processing. Many positions are open on the project.