The Linguistic Computing Laboratory (LCL) is part of the Computer Science Department of the Sapienza University of Rome. The group conducts state-of-the-art research in the area of Natural Language Processing.

The group devises and develops algorithms and methods in machine learning, pattern matching and recognition, and natural language processing. Target problems include automatic text understanding; the construction, learning and population of ontologies; semantic text indexing and classification; query expansion; and question answering.

Research fields include:
  • Multilingual Word Sense Disambiguation and Induction
  • Multilingual Entity Linking
  • Broad and Deep Learning
  • Distributional semantic similarity
  • Ontology Learning and Population
  • Large-Scale Knowledge Acquisition
  • Semantic and Statistical Machine Translation
  • Semantic Information Retrieval
  • Social Network Analysis and Mining

POSTDOC POSITION IN MULTILINGUAL NLP (SEMANTIC PARSING)

Sapienza University of Rome, Italy

Department of Computer Science (first-ranked Italian Department of Excellence)

One postdoc position (1+2 years) in Natural Language Processing is open in the Linguistic Computing Laboratory (http://lcl.uniroma1.it), Department of Computer Science of the Sapienza University of Rome.

The position is part of ELEXIS (2018-2022), a new European infrastructure for electronic lexicography, and MOUSSE (2017-2022), a new 5-year ERC Consolidator Grant funded by the European Research Council (ERC) and headed by prof. Roberto Navigli, following the success of his MultiJEDI ERC Starting Grant (http://multijedi.org). The successful candidate will participate in a frontier research project aimed at designing and investigating novel neural network architectures for multilingual disambiguation and semantic parsing and will work in the vibrant environment of a leading and highly active international research team comprising 3 faculty members, 1 post-doc and 6 Ph.D. students. The group has published dozens of papers in top-tier venues in the field of computational linguistics and artificial intelligence.

Collaborators in the research group also have the option to interact with Babelscape, a Sapienza startup company founded by prof. Navigli which brings research in multilingual Natural Language Processing to the market and makes research projects, such as the award-winning BabelNet, sustainable in the long term. Babelscape is currently working for key players in different fields, including multinational companies, and national and international public bodies. Around 20 developers and researchers currently work in the company.

REQUIREMENTS/QUALIFICATIONS

The successful candidate will work actively on novel directions in deep learning and neural networks for multilingual disambiguation and semantic parsing in arbitrary languages, and will co-supervise Ph.D. students in the group on the topic. The candidate will innovate the field while at the same time taking advantage of successful approaches and resources for multilingual lexical semantics such as BabelNet (winner of the 2017 Artificial Intelligence prominent paper award and featured in The Guardian and Time magazine), the Multilingual Wikipedia Bitaxonomy, sense-based embedded representations (SensEmbed), and sense-based lexical, semantic and embedded vectors (NASARI), state-of-the-art neural Word Sense Disambiguation and train-o-matic.

The candidate is expected to have:

  • A Ph.D. or equivalent in Computer Science, Computational Linguistics/NLP, Mathematics or related fields.
  • Good programming skills in Python.
  • Fluent English. Knowledge of other languages (especially Asian languages) is more than welcome. Knowledge of Italian is NOT a requirement.
  • Knowledge of current neural network models, especially recurrent neural networks such as LSTMs, and tools for neural networks (e.g. Tensorflow, Keras, PyTorch, etc.).
  • Publications in top-tier venues in the field of Computational Linguistics.
  • Experience in Ph.D. student supervision is a plus.

INFORMATION

  • Application deadline: early April 2018
  • Interviews will take place via Skype in the second part of April 2018
  • Starting date: as early as possible, ideally on May 2nd, 2018
  • Duration: 1+3 years
  • Salary: 32000 euros per annum. Note that this type of research contract is exempt from taxes, while including social insurance (32000 euros per annum corresponds to around 2000 euros net per month).

HOW TO APPLY

Informal enquiries can be sent by email to prof. Roberto Navigli (navigli@di.uniroma1.it). The application requires a brief motivation letter, a detailed CV and contact details for up to three references. Please include the job reference [LCL-POSTDOC-2018] in the subject line.

ABOUT THE SAPIENZA UNIVERSITY OF ROME

The Sapienza University of Rome is a seven-century-old university in the heart of Rome. It is one of the largest universities in Europe, with around 110,000 students. Its Faculty of Information Engineering, Informatics and Statistics (which includes the Department of Computer Science) is one of the youngest, most energetic and scientifically active environments at Sapienza. Sapienza has a portfolio of 30 ERC grants, 5 of which are in the Department of Computer Science. Sapienza is an equal opportunity employer.

ABOUT THE SAPIENZA COMPUTER SCIENCE DEPARTMENT

The Department of Computer Science is a modern and well-equipped research institution with a top-class faculty and a strong Ph.D. program. It is a winner of the “department of excellence” national grant in computer science (ranked first among hundreds of candidates, among which only 3 departments in the field obtained the grant), with 6.5 million euros to be spent in infrastructures, positions and grants in the upcoming 5 years. The Department comprises 44 faculty members (with 5 ERC grants), 15 postdocs and around 30 Ph.D. students. The successful candidate will be based in Rome, one of the most beautiful cities in the world. The Department is situated just across the road from the main University Campus, close to a student area with lots of cafes, bars, restaurants, and only a short bus ride from the city centre. Candidates should not be afraid of the language barrier, as Italians are in general very friendly. English is spoken widely throughout the Department.

BabelNet in The Guardian!

Last Friday BabelNet was mentioned as a "bigger dream" for autonomous machine reading in The Guardian.

Release of BabelNet 4

We are proud to announce the release of a new major version of BabelNet and its API, developed jointly by the Linguistic Computing Laboratory of the Sapienza University of Rome under the supervision of prof. Roberto Navigli and Babelscape, a Sapienza startup company providing innovative solutions for multilingual NLP.

BabelNet -- winner of the prominent paper award 2017 from the Artificial Intelligence Journal and the META prize 2015, and covered in media such as The Guardian and Time magazine -- is today’s most far-reaching multilingual resource which, according to need, can be used as an encyclopedic dictionary, or a semantic network or a huge knowledge base.


BabelNet was created by means of the seamless integration and interlinking of the largest multilingual Web encyclopedia - i.e., Wikipedia - with the most popular computational lexicon of English - i.e., WordNet, and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, Wikipedia infoboxes, dozens of wordnets, Wikiquote, FrameNet, VerbNet, Microsoft Terminology, GeoNames, and ImageNet. BabelNet provides multilingual synsets, i.e., concepts and named entities lexicalized in many languages, and connected with large amounts of semantic relations.
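As a rough illustration of the idea of a multilingual synset described above, the following minimal sketch models a concept with lexicalizations in several languages and typed relations to other concepts. The class, field names and identifiers are hypothetical and chosen only for this example; they are not the data model or API of BabelNet.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Synset:
    """Hypothetical multilingual synset: one concept, many lexicalizations."""

    synset_id: str  # illustrative identifier, not a real BabelNet ID
    # Language code -> lemmas expressing the concept in that language.
    lexicalizations: Dict[str, List[str]] = field(default_factory=dict)
    # (relation name, target synset ID) pairs linking to other concepts.
    relations: List[Tuple[str, str]] = field(default_factory=list)

    def lemmas(self, lang: str) -> List[str]:
        """Return the lemmas for one language, or an empty list if none."""
        return self.lexicalizations.get(lang, [])

    def related(self, relation: str) -> List[str]:
        """Return the IDs of synsets reachable via the given relation."""
        return [target for name, target in self.relations if name == relation]


# Made-up example data: one concept lexicalized in three languages.
apple = Synset(
    synset_id="example:0001",
    lexicalizations={"EN": ["apple"], "IT": ["mela"], "FR": ["pomme"]},
    relations=[("hypernym", "example:0002")],
)
print(apple.lemmas("IT"), apple.related("hypernym"))
```

Keeping lexicalizations keyed by language is one simple way to let the same concept be looked up from any language; a real resource of this size would use indexed storage rather than in-memory objects.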


Version 4.0 comes with the following features:

  • 284 languages now covered
  • Wikipedia, Wiktionary, Wikidata and OmegaWiki have been updated thanks to BabelNet live, a continuously-growing resource with daily updates from all the sources that go to make it up
  • Better sense inventory thanks to the manual validation of thousands of mappings
  • All existing wordnets updated
  • New wordnets integrated for Gaelic, Portuguese and Korean
  • Improved treatment of Chinese
  • 2 million new multilingual synsets (from 14 million in v3.7 to 16 million in v4.0)
  • 832 million senses (up from 745 million Babel senses in v3.7), increasing language coverage considerably
  • Improved management of open wordnets that are now stored with their individual licenses
  • Improved version of the Java and HTTP RESTful API (http://babelnet.org/download). The Java API comes with reengineered interfaces and classes, additional methods for Java 8 and a Java 9-ready packaging. A brand-new Python API is under development.

More statistics are available at: http://babelnet.org/stats.


We are organizing a two-day summer school and hackathon to be held in Rome or Venice (location to be decided), with tutorials, interactive sessions and presentations. We are gathering interest and preferences: if you are potentially interested, just fill in the form!


Kind regards,
The BabelNet group

AIJ Prominent Paper Award 2017

BabelNet won the Prominent Paper Award 2017 from Artificial Intelligence, the most prestigious journal in the field of AI. This year the award selected the best article published in 2009-2016. The paper, authored by Roberto Navigli and Simone Paolo Ponzetto, presents the algorithmic techniques for creating and evaluating the BabelNet multilingual semantic network.

Reference paper:

Pre-Ph.D.+Ph.D. Position in Multilingual NLP (Semantic Parsing)

One pre-Ph.D. research position (with the possibility of starting a Ph.D. in 2018 on the same salary with a privileged track) in Natural Language Processing is open.

The position is part of a new 5-year ERC Consolidator Grant funded by the European Research Council (ERC) and headed by prof. Roberto Navigli, following the success of his MultiJEDI ERC Starting Grant (http://multijedi.org). The successful candidate will participate in a frontier research project aimed at designing and investigating novel neural network architectures for multilingual disambiguation and semantic parsing and will work in the vibrant environment of a leading and highly active international research team comprising 3 faculty members, 1 post-doc and 6 Ph.D. students. The group has published dozens of papers in top-tier venues in the field of computational linguistics and artificial intelligence.

Interested students and collaborators in the research group have the option to interact with Babelscape, a Sapienza startup company founded by prof. Navigli which brings research in multilingual Natural Language Processing to the market and makes research projects, such as the award-winning BabelNet, sustainable in the long term. Babelscape is currently working for key players in different fields, including multinational companies, and national and international public bodies. Around 15 developers and researchers are working in the company.

REQUIREMENTS/QUALIFICATIONS

The successful candidate will work actively on new directions in deep learning and neural networks for multilingual lexical semantic tasks such as Word Sense Disambiguation, Entity Linking and semantic parsing in arbitrary languages, starting from successful approaches and resources for multilingual lexical semantics such as BabelNet (winner of the 2017 Artificial Intelligence prominent paper award), the Multilingual Wikipedia Bitaxonomy, SensEmbed, NASARI explicit and embedded vectors, state-of-the-art neural Word Sense Disambiguation and train-o-matic.

The candidate is expected to have:
  • An M.Sc. or equivalent in Computer Science, Computational Linguistics/NLP, Mathematics or related fields.
  • Good programming skills in Python and/or Java (there is the option to attend a Python/Java course at Sapienza on the first contract year).
  • Fluent English. Knowledge of other languages (especially Asian languages) is more than welcome. Knowledge of Italian is NOT a requirement.
  • Knowledge of current neural network models, especially recurrent neural networks such as LSTMs, and tools for neural networks (e.g. Tensorflow, Keras, Torch, Theano, etc.) is a plus.
  • Publications in Computational Linguistics, participation in summer schools and other experiences are a plus.

INFORMATION

  • Application deadline: early October 2017
  • Interviews will take place via Skype around the end of October/beginning of November
  • Starting date: as early as possible, ideally on December 1st, 2017
  • Duration: 1+3 years
  • Salary: 25000 euros per annum. Note that this type of research contract is exempt from taxes, while including social insurance (25000 euros per annum corresponds to around 1560 euros net per month).

HOW TO APPLY

Information can be requested by email to Roberto Navigli (navigli@di.uniroma1.it). The application requires a brief motivation letter, a detailed CV and contact details for up to three references. Please include the job reference [LCL1-2017] in the subject line.

Candidates attending EMNLP this week are welcome for an informal chat on the position.

ERC Consolidator Grant!

Prof. Roberto Navigli has been awarded a prestigious ERC Consolidator Grant in Computer Science and Informatics. Stay tuned for important updates and open positions!

NASARI and MultiWiBi in Artificial Intelligence



Two new Artificial Intelligence Journal articles from LCL: NASARI and MultiWiBi:

José Camacho Collados, Mohammad Taher Pilehvar and Roberto Navigli. Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities. Artificial Intelligence (2016), volume 240, pages 36-64.

Tiziano Flati, Daniele Vannella, Tommaso Pasini and Roberto Navigli. MultiWiBi: The multilingual Wikipedia bitaxonomy project. Artificial Intelligence (2016), volume 241, pages 66-102.

ACL Tutorial 2016: Semantic Representations of Word Senses and Concepts

The LCL members José Camacho Collados, Ignacio Iacobacci, Roberto Navigli, and Mohammad Taher Pilehvar (currently at the University of Cambridge) will be presenting a tutorial on “Semantic Representations of Word Senses and Concepts” in Berlin at the ACL conference (August 7th, 2016).

José Camacho Collados received a prestigious Google PhD Fellowship!



We are proud to announce that José Camacho Collados has been awarded the 2016 Google Fellowship in Natural Language Processing!

BabelNet 3.7 is now out!



We are happy to announce the release of a new version of BabelNet.

BabelNet (http://babelnet.org) is the largest multilingual encyclopedic dictionary and semantic network created by means of the seamless integration of the largest multilingual Web encyclopedia - i.e., Wikipedia - with the most popular computational lexicon of English - i.e., WordNet, and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, Open Multilingual WordNet, Wikiquote, VerbNet, Microsoft Terminology, GeoNames, WoNeF, ImageNet, ItalWordNet, Open Dutch WordNet and FrameNet. The integration is performed via an automatic linking algorithm and by filling in lexical gaps with the aid of Machine Translation. The result is an encyclopedic dictionary that provides Babel synsets, i.e., concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations.

Version 3.7 comes with the following features:
  • New resource integrated: FrameNet.
  • More than 2500 Babel synsets identified as key concepts.
  • Mappings with several versions of WordNet now integrated (from 1.6 to 3.0).
  • More than 2.6 million Babel synsets labeled with domains (were 1,558,806 in v3.6).

More statistics are available at: http://babelnet.org/stats

BabelNet was part of the MultiJEDI project originally funded by the European Research Council and headed by Prof. Roberto Navigli at the Linguistic Computing Laboratory of the Sapienza University of Rome. BabelNet is now a self-sustained project. It is, and always will be, free for research purposes, including download. Babelscape, a Sapienza startup company, is BabelNet's commercial support arm, thanks to which the project will be continued and improved over time.

Enjoy!

BabelNet in TIME magazine!

BabelNet features prominently in TIME magazine, in the inspiring article "Redefining the modern dictionary" by Katy Steinmetz. The article talks about the new age of innovative and up-to-date lexical knowledge resources available on the Web, and describes in some detail how BabelNet is playing a leading role in this 21st century scenario!

BabelNet 3.6 is now out!



As the final output of the "MultiJEDI" Starting Grant (http://multijedi.org), funded by the European Research Council and headed by Prof. Roberto Navigli, the Linguistic Computing Laboratory of the Sapienza University of Rome is proud to announce the release of BabelNet 3.6.

Version 3.6 comes with the following features:
  • New resources integrated: ItalWordNet, Open Dutch WordNet.
  • 625 million new senses (for a total of 745 million Babel senses, increasing language coverage considerably).
  • 6.4 million surface forms for Babel synsets.
  • 3.5 million YAGO external links.
  • Improved version of the Java and HTTP RESTful API (http://babelnet.org/download)
  • For fans of offline processing with non-commercial purposes: downloadable offline indices starting shortly!

More statistics are available at: http://babelnet.org/stats

Enjoy!

The Luxembourg BabelNet Workshop

2-3 March, 2016, Luxembourg
http://babelnet.org/lux

Schuman Building of the European Parliament, Hemicycle. 2929 Luxembourg

Organized by:
EU Publications Office, European Commission, European Parliament

We are proud to announce the Luxembourg BabelNet workshop. This event is a technical workshop on BabelNet, the largest multilingual encyclopedic dictionary and semantic network -- now also a huge knowledge base -- covering 14 million concepts and named entities in 272 languages. The workshop will take place over two days. The first day is a technical guided tour, including industrial applications. The second day consists of four case studies of resource mapping to BabelNet.

The workshop is open to all comers. It will be held in English and attendance is free, up to the room capacity.

This is an IT technical workshop; the ideal background of participants is computer science and natural language processing. However, participants from other backgrounds with an interest in the IT aspects of their specialty (like authors, translators and publishers) should also benefit, though they must be aware of the technical IT nature of the workshop. Regardless of their background, participants will gain a deep understanding of BabelNet: at a minimum, they will be able to properly use even the most advanced functionalities, such as traversing the network, multilingual disambiguation and high-performance mapping; some may be able to contribute at the conceptual level; and, at the higher end, to contribute to and get involved in the development.

BabelNet 3.5 is now out!

As an output of the "MultiJEDI" Starting Grant, funded by the European Research Council and headed by Prof. Roberto Navigli, the Linguistic Computing Laboratory http://lcl.uniroma1.it of the Sapienza University of Rome is proud to announce the release of BabelNet 3.5.

BabelNet (http://babelnet.org) is a very large multilingual encyclopedic dictionary and semantic network created by means of the seamless integration of the largest multilingual Web encyclopedia - i.e., Wikipedia - with the most popular computational lexicon of English - i.e., WordNet, and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, Open Multilingual WordNet, Wikiquote, VerbNet, Microsoft Terminology, GeoNames, WoNeF, and ImageNet. The integration is performed via an automatic linking algorithm and by filling in lexical gaps with the aid of Machine Translation. The result is an encyclopedic dictionary that provides Babel synsets, i.e., concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations.

Version 3.5 comes with the following features:
More statistics are available at: http://babelnet.org/stats

Enjoy!

BabelNet received the prestigious META prize 2015!!!

BabelNet - for groundbreaking work in overcoming language barriers through a multilingual lexicalised semantic network and ontology making use of heterogeneous data sources. The resulting encyclopedic dictionary provides concepts and named entities lexicalised in many languages, enriched with semantic relations.

Babelfy 1.0 is now out!

Babelfy is a joint, unified approach to Word Sense Disambiguation and Entity Linking in the language of your choice. The system is based on a loose identification of candidate meanings coupled with a densest subgraph heuristic which selects high-coherence semantic interpretations. Its performance on standard word sense disambiguation and entity linking tasks is on a par with, or surpasses, that of language- and task-specific state-of-the-art systems.
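To give a feel for what a densest subgraph heuristic does, here is a minimal sketch of the generic greedy "peeling" approximation: repeatedly remove the least-connected node and keep the intermediate subgraph with the highest edge-to-node ratio. This is a textbook technique shown under simplifying assumptions (an unweighted, undirected graph), not the procedure actually implemented in Babelfy; the function name and the candidate-meaning labels are hypothetical.

```python
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling approximation of the densest subgraph.

    edges: iterable of (u, v) pairs of an undirected graph.
    Returns (nodes, density), where density = number of edges / number of nodes.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    nodes = set(adj)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes = set(nodes)
    best_density = num_edges / len(nodes) if nodes else 0.0

    while len(nodes) > 1:
        # Drop the node with the fewest remaining neighbours
        # (ties broken by label so that the result is deterministic).
        weakest = min(nodes, key=lambda n: (len(adj[n]), str(n)))
        for nbr in adj[weakest]:
            adj[nbr].discard(weakest)
        num_edges -= len(adj[weakest])
        nodes.remove(weakest)
        del adj[weakest]

        density = num_edges / len(nodes)
        if density > best_density:
            best_nodes, best_density = set(nodes), density

    return best_nodes, best_density


# Hypothetical candidate meanings; an edge marks two semantically related senses.
coherent = ["plant#factory", "worker#employee", "shift#work_period", "union#labor"]
example_edges = [(a, b) for i, a in enumerate(coherent) for b in coherent[i + 1:]]
example_edges.append(("plant#organism", "leaf#foliage"))

selected, density = greedy_densest_subgraph(example_edges)
print(sorted(selected), density)
```

In this toy graph the four mutually related senses survive the peeling (density 1.5), while the weakly connected "plant#organism" reading is discarded, which is the intuition behind selecting high-coherence interpretations.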

New features in Babelfy v1.0:
  • 271 languages covered plus a novel language-agnostic setting!

  • Available via easy-to-use Java and HTTP RESTful APIs.

  • The input context can be either a text or a bag of words where you can mix up languages!

  • Plenty of tunable parameters for the disambiguation procedure, such as setting your own threshold, enabling multiple scored annotations of the same fragment, restricting the annotations to WordNet, Wikipedia or BabelNet, specifying the offsets that you want to be linked, providing pre-annotated tokens as disambiguation context, and disabling/enabling the most common sense heuristic, multi-word expressions and the densest subgraph heuristic.

  • Three different scores are now output: the disambiguation score, a coherence score and a global relevance score.

  • Disambiguation and entity linking are performed using BabelNet, thereby implicitly annotating according to several different inventories such as WordNet, Wikipedia, Wiktionary, OmegaWiki, etc.

BabelNet 3.0 is now out!

BabelNet (http://babelnet.org) is a very large multilingual encyclopedic dictionary and semantic network created by means of the seamless integration of the largest multilingual Web encyclopedia - i.e., Wikipedia - with the most popular computational lexicon of English - i.e., WordNet, and other lexical resources such as Wiktionary, OmegaWiki, Wikidata, and the Open Multilingual WordNet. The integration is performed via an automatic linking algorithm and by filling in lexical gaps with the aid of Machine Translation. The result is an encyclopedic dictionary that provides Babel synsets, i.e., concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations.

Version 3.0 comes with the following features:
  • 271 languages now covered!

  • New Java and HTTP RESTful API

  • Fully taxonomized thanks to the seamless integration of our Wikipedia Bitaxonomy

  • 13.7 million meanings (Babel synsets)

  • 40.3 million textual definitions

Tutorials in 2014

We are presenting tutorials at four different conferences:

Babelfy released!

We are happy to announce the release of Babelfy which is a unified approach to multilingual Word Sense Disambiguation and Entity Linking.
Babelfy website

ERC Starting Grant!

Prof. Roberto Navigli has been awarded a prestigious ERC starting grant in computer science and informatics (2011-2016). The project, called MultiJEDI, will focus on multilingual semantic processing. Many positions are open on the project.