Publications

  • AAAI 2020

    B. Scarlini, T. Pasini, R. Navigli

    SensEmBERT: Context-Enhanced Sense Embeddings for Multilingual Word Sense Disambiguation

    Proc. of the 34th AAAI Conference on Artificial Intelligence (AAAI 2020), New York, USA, 7-12th February, 2020.

    AAAI 2020

    BibTex

    @inproceedings{scarlini2020sensembert,
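        title={{SensEmBERT}: Context-Enhanced Sense Embeddings for Multilingual Word Sense Disambiguation},
        author={Scarlini, Bianca and Pasini, Tommaso and Navigli, Roberto},
        booktitle={Proc. of the 34th AAAI Conference on Artificial Intelligence (AAAI 2020)},
        year={2020}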
    }
                    
  • AAAI 2020

    C. Lacerra, M. Bevilacqua, T. Pasini, R. Navigli

    CSI: A Coarse Sense Inventory for 85% Word Sense Disambiguation

    Proc. of the 34th AAAI Conference on Artificial Intelligence (AAAI 2020), New York, USA, 7-12th February, 2020.

    AAAI 2020

    BibTex

    @inproceedings{lacerra2020csi,
        title = {CSI: A coarse sense inventory for 85\% word sense disambiguation},
        author = {Lacerra, Caterina and Bevilacqua, Michele and Pasini, Tommaso and Navigli, Roberto},
        booktitle = {Proc. of AAAI},
        year = {2020}
    }
                    
  • ACL 2020

    M. Bevilacqua, R. Navigli

    Breaking Through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information

    Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

    ACL 2020

    BibTex

    @inproceedings{bevilacqua-navigli-2020-breaking,
        title = "Breaking Through the 80{\%} Glass Ceiling: {R}aising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information",
        author = "Bevilacqua, Michele  and
          Navigli, Roberto",
        booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
        month = jul,
        year = "2020",
        address = "Online",
        publisher = "Association for Computational Linguistics",
        url = "https://www.aclweb.org/anthology/2020.acl-main.255",
        pages = "2854--2864",
        abstract = "Neural architectures are the current state of the art in Word Sense Disambiguation (WSD). However, they make limited use of the vast amount of relational information encoded in Lexical Knowledge Bases (LKB). We present Enhanced WSD Integrating Synset Embeddings and Relations (EWISER), a neural supervised architecture that is able to tap into this wealth of knowledge by embedding information from the LKB graph within the neural architecture, and to exploit pretrained synset embeddings, enabling the network to predict synsets that are not in the training set. As a result, we set a new state of the art on almost all the evaluation settings considered, also breaking through, for the first time, the 80{\%} ceiling on the concatenation of all the standard all-words English WSD evaluation benchmarks. On multilingual all-words WSD, we report state-of-the-art results by training on nothing but English.",
    }
                    
  • ACL 2020

    A. Calabrese, M. Bevilacqua, R. Navigli

    Fatality Killed the Cat or: BabelPic, a Multimodal Dataset for Non-Concrete Concepts

    Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

    ACL 2020

    BibTex

    @inproceedings{calabrese-etal-2020-fatality,
        title = "Fatality Killed the Cat or: {B}abel{P}ic, a Multimodal Dataset for Non-Concrete Concepts",
        author = "Calabrese, Agostina  and
          Bevilacqua, Michele  and
          Navigli, Roberto",
        booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
        month = jul,
        year = "2020",
        address = "Online",
        publisher = "Association for Computational Linguistics",
        url = "https://www.aclweb.org/anthology/2020.acl-main.425",
        pages = "4680--4686",
        abstract = "Thanks to the wealth of high-quality annotated images available in popular repositories such as ImageNet, multimodal language-vision research is in full bloom. However, events, feelings and many other kinds of concepts which can be visually grounded are not well represented in current datasets. Nevertheless, we would expect a wide-coverage language understanding system to be able to classify images depicting recess and remorse, not just cats, dogs and bridges. We fill this gap by presenting BabelPic, a hand-labeled dataset built by cleaning the image-synset association found within the BabelNet Lexical Knowledge Base (LKB). BabelPic explicitly targets non-concrete concepts, thus providing refreshing new data for the community. We also show that pre-trained language-vision systems can be used to further expand the resource by exploiting natural language knowledge available in the LKB. BabelPic is available for download at http://babelpic.org.",
    }
                    
  • AI 2020

    T. Pasini, R. Navigli

    Train-O-Matic: Supervised Word Sense Disambiguation with no (manual) effort

    Artificial Intelligence, 279, Elsevier, 2020.

    AI 2020

    BibTex

    @article{PASINI2020103215,
        title = "Train-O-Matic: Supervised Word Sense Disambiguation with no (manual) effort",
        journal = "Artificial Intelligence",
        volume = "279",
        pages = "103215",
        year = "2020",
        issn = "0004-3702",
        doi = "10.1016/j.artint.2019.103215",
        url = "http://www.sciencedirect.com/science/article/pii/S0004370218307021",
        author = "Tommaso Pasini and Roberto Navigli",
        keywords = "Word Sense Disambiguation, Corpus Generation, Word Sense Distribution learning, Multilinguality",
        abstract = "Word Sense Disambiguation (WSD) is the task of associating the correct meaning with a word in a given context. WSD provides explicit semantic information that is beneficial to several downstream applications, such as question answering, semantic parsing and hypernym extraction. Unfortunately, WSD suffers from the well-known knowledge acquisition bottleneck problem: it is very expensive, in terms of both time and money, to acquire semantic annotations for a large number of sentences. To address this blocking issue we present Train-O-Matic, a knowledge-based and language-independent approach that is able to provide millions of training instances annotated automatically with word meanings. The approach is fully automatic, i.e., no human intervention is required, and the only type of human knowledge used is a task-independent WordNet-like resource. Moreover, as the sense distribution in the training set is pivotal to boosting the performance of WSD systems, we also present two unsupervised and language-independent methods that automatically induce a sense distribution when given a simple corpus of sentences. We show that, when the learned distributions are taken into account for generating the training sets, the performance of supervised methods is further enhanced. Experiments have proven that Train-O-Matic on its own, and also coupled with word sense distribution learning methods, lead a supervised system to achieve state-of-the-art performance consistently across gold standard datasets and languages. Importantly, we show how our sense distribution learning techniques aid Train-O-Matic to scale well over domains, without any extra human effort. To encourage future research, we release all the training sets in 5 different languages and the sense distributions for each domain of SemEval-13 and SemEval-15 at http://trainomatic.org."
    }
                    
  • ACL 2019

    B. Scarlini, T. Pasini, R. Navigli

    Just "OneSeC" for Producing Multilingual Sense-Annotated Data

    Proc. of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, July 28th-August 2nd, 2019, pp. 699-709.

    ACL 2019

    BibTex

    @inproceedings{scarlini-etal-2019-just,
        title = "Just {``}{O}ne{S}e{C}{''} for Producing Multilingual Sense-Annotated Data",
        author = "Scarlini, Bianca  and
          Pasini, Tommaso  and
          Navigli, Roberto",
        booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
        month = jul,
        year = "2019",
        address = "Florence, Italy",
        publisher = "Association for Computational Linguistics",
        url = "https://www.aclweb.org/anthology/P19-1069",
        doi = "10.18653/v1/P19-1069",
        pages = "699--709",
        abstract = "The well-known problem of knowledge acquisition is one of the biggest issues in Word Sense Disambiguation (WSD), where annotated data are still scarce in English and almost absent in other languages. In this paper we formulate the assumption of One Sense per Wikipedia Category and present OneSeC, a language-independent method for the automatic extraction of hundreds of thousands of sentences in which a target word is tagged with its meaning. Our automatically-generated data consistently lead a supervised WSD model to state-of-the-art performance when compared with other automatic and semi-automatic methods. Moreover, our approach outperforms its competitors on multilingual and domain-specific settings, where it beats the existing state of the art on all languages and most domains. All the training data are available for research purposes at http://trainomatic.org/onesec.",
    }
                    
  • ACL 2019

    I. Iacobacci, R. Navigli

    LSTMEmbed: Learning Word and Sense Representations from a Large Semantically Annotated Corpus with Long Short-Term Memories

    Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

    ACL 2019

    BibTex

    @inproceedings{iacobacci2019lstmembed,
      title={{LSTMEmbed}: Learning Word and Sense Representations from a Large Semantically Annotated Corpus with Long Short-Term Memories},
      author={Iacobacci, Ignacio and Navigli, Roberto},
      booktitle={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
      pages={1685--1695},
      year={2019}
    }
                    
  • EMNLP 2019

    R. Tripodi, R. Navigli

    Game Theory Meets Embeddings: a Unified Framework for Word Sense Disambiguation

    Proc. of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP 2019), Hong Kong, China, 3-7th November, 2019.

    EMNLP 2019

    BibTex

    @inproceedings{tripodi-navigli-2019-game,
        title = "Game Theory Meets Embeddings: a Unified Framework for Word Sense Disambiguation",
        author = "Tripodi, Rocco  and
          Navigli, Roberto",
        booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
        month = nov,
        year = "2019",
        address = "Hong Kong, China",
        publisher = "Association for Computational Linguistics",
        url = "https://www.aclweb.org/anthology/D19-1009",
        doi = "10.18653/v1/D19-1009",
        pages = "88--99",
        abstract = "Game-theoretic models, thanks to their intrinsic ability to exploit contextual information, have shown to be particularly suited for the Word Sense Disambiguation task. They represent ambiguous words as the players of a non cooperative game and their senses as the strategies that the players can select in order to play the games. The interaction among the players is modeled with a weighted graph and the payoff as an embedding similarity function, that the players try to maximize. The impact of the word and sense embedding representations in the framework has been tested and analyzed extensively: experiments on standard benchmarks show state-of-art performances and different tests hint at the usefulness of using disambiguation to obtain contextualized word representations.",
    }
                    
  • EMNLP 2019

    A. Di Fabio, S. Conia, R. Navigli

    VerbAtlas: a Novel Large-Scale Verbal Semantic Resource and Its Application to Semantic Role Labeling

    Proc. of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP 2019), Hong Kong, China, 3-7th November, 2019.

    EMNLP 2019

    BibTex

    @inproceedings{di-fabio-etal-2019-verbatlas,
        title = "{V}erb{A}tlas: a Novel Large-Scale Verbal Semantic Resource and Its Application to Semantic Role Labeling",
        author = "Di Fabio, Andrea  and
          Conia, Simone  and
          Navigli, Roberto",
        booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
        month = nov,
        year = "2019",
        address = "Hong Kong, China",
        publisher = "Association for Computational Linguistics",
        url = "https://www.aclweb.org/anthology/D19-1058",
        doi = "10.18653/v1/D19-1058",
        pages = "627--637",
        abstract = "We present VerbAtlas, a new, hand-crafted lexical-semantic resource whose goal is to bring together all verbal synsets from WordNet into semantically-coherent frames. The frames define a common, prototypical argument structure while at the same time providing new concept-specific information. In contrast to PropBank, which defines enumerative semantic roles, VerbAtlas comes with an explicit, cross-frame set of semantic roles linked to selectional preferences expressed in terms of WordNet synsets, and is the first resource enriched with semantic information about implicit, shadow, and default arguments. We demonstrate the effectiveness of VerbAtlas in the task of dependency-based Semantic Role Labeling and show how its integration into a high-performance system leads to improvements on both the in-domain and out-of-domain test sets of CoNLL-2009. VerbAtlas is available at http://verbatlas.org.",
    }
                    
  • EMNLP 2019

    M. Maru, F. Scozzafava, F. Martelli, R. Navigli

    SyntagNet: Challenging Supervised Word Sense Disambiguation with Lexical-Semantic Combinations

    Proc. of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP 2019), Hong Kong, China, 3-7th November, 2019, pp. 3534-3540.

    EMNLP 2019

    BibTex

    @inproceedings{maru-etal-2019-syntagnet,
        title = "{S}yntag{N}et: Challenging Supervised Word Sense Disambiguation with Lexical-Semantic Combinations",
        author = "Maru, Marco  and
          Scozzafava, Federico  and
          Martelli, Federico  and
          Navigli, Roberto",
        booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
        month = nov,
        year = "2019",
        address = "Hong Kong, China",
        publisher = "Association for Computational Linguistics",
        url = "https://www.aclweb.org/anthology/D19-1359",
        doi = "10.18653/v1/D19-1359",
        pages = "3534--3540",
        abstract = "Current research in knowledge-based Word Sense Disambiguation (WSD) indicates that performances depend heavily on the Lexical Knowledge Base (LKB) employed. This paper introduces SyntagNet, a novel resource consisting of manually disambiguated lexical-semantic combinations. By capturing sense distinctions evoked by syntagmatic relations, SyntagNet enables knowledge-based WSD systems to establish a new state of the art which challenges the hitherto unrivaled performances attained by supervised approaches. To the best of our knowledge, SyntagNet is the first large-scale manually-curated resource of this kind made available to the community (at http://syntagnet.org).",
    }
                    
  • KBS 2019

    R. Sinoara, J. Camacho-Collados, R. Rossi, R. Navigli, S. Rezende

    Knowledge-enhanced document embeddings for text classification

    Knowledge-Based Systems, 163, Elsevier, 2019, pp. 955-971.

    KBS 2019

    BibTex

    @article{sinoara2019knowledge,
      title={Knowledge-enhanced document embeddings for text classification},
      author={Sinoara, Roberta A and Camacho-Collados, Jose and Rossi, Rafael G and Navigli, Roberto and Rezende, Solange O},
      journal={Knowledge-Based Systems},
      volume={163},
      pages={955--971},
      year={2019},
      publisher={Elsevier}
    }
                    
  • LREC 2019

    J. Camacho-Collados, C. Delli Bovi, A. Raganato, R. Navigli

    SenseDefs: a multilingual corpus of semantically annotated textual definitions - Exploiting multiple languages and resources jointly for high-quality Word Sense Disambiguation and Entity Linking

    Language Resources and Evaluation

    LREC 2019

    BibTex

    @article{camacho2019s,
      title={{SenseDefs}: a multilingual corpus of semantically annotated textual definitions},
      author={Camacho-Collados, Jose and Bovi, Claudio Delli and Raganato, Alessandro and Navigli, Roberto},
      journal={Language Resources and Evaluation},
      volume={53},
      number={2},
      pages={251--278},
      year={2019},
      publisher={Springer}
    }
                    
  • NLE 2019

    R. Navigli, F. Martelli

    An overview of word and sense similarity

    Natural Language Engineering

    NLE 2019

    BibTex

    @article{navigli2019overview,
      title={An overview of word and sense similarity},
      author={Navigli, Roberto and Martelli, Federico},
      journal={Natural Language Engineering},
      volume={25},
      number={6},
      pages={693--714},
      year={2019},
      publisher={Cambridge University Press}
    }
                    
  • RANLP 2019

    M. Bevilacqua, R. Navigli

    Quasi Bidirectional Encoder Representations from Transformers for Word Sense Disambiguation

    Proc. of the 2019 conference on Recent Advances in Natural Language Processing (RANLP 2019), Varna, Bulgaria, September 2-4th, 2019, pp. 122-131.

    RANLP 2019

    BibTex

    @inproceedings{bevilacqua-navigli-2019-quasi,
        title = "Quasi Bidirectional Encoder Representations from Transformers for Word Sense Disambiguation",
        author = "Bevilacqua, Michele  and
          Navigli, Roberto",
        booktitle = "Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)",
        month = sep,
        year = "2019",
        address = "Varna, Bulgaria",
        publisher = "INCOMA Ltd.",
        url = "https://www.aclweb.org/anthology/R19-1015",
        doi = "10.26615/978-954-452-056-4_015",
        pages = "122--131",
        abstract = "While contextualized embeddings have produced performance breakthroughs in many Natural Language Processing (NLP) tasks, Word Sense Disambiguation (WSD) has not benefited from them yet. In this paper, we introduce QBERT, a Transformer-based architecture for contextualized embeddings which makes use of a co-attentive layer to produce more deeply bidirectional representations, better-fitting for the WSD task. As a result, we are able to train a WSD system that beats the state of the art on the concatenation of all evaluation datasets by over 3 points, also outperforming a comparable model using ELMo.",
    }
                    
  • AAAI 2018

    T. Pasini, R. Navigli

    Two Knowledge-based Methods for High-Performance Sense Distribution Learning

    Proceedings of the 2018 Conference of the Association for the Advancement of Artificial Intelligence

    AAAI 2018

    BibTex

    @InProceedings{PasiniNavigli:2018,
      author = {Pasini, Tommaso and Navigli, Roberto},
      title = {Two Knowledge-based Methods for High-Performance Sense Distribution Learning},
      booktitle = {Proc. of the 32nd {AAAI} {C}onference on {A}rtificial {I}ntelligence},
      year = {2018},
      address = {New Orleans, {USA}},
    }
                    
  • IJCAI 2018

    R. Navigli

    Natural Language Understanding: Instructions for (Present and Future) Use

    Proc. of the 27th International Joint Conference on Artificial Intelligence (IJCAI 2018), Stockholm, Sweden, 13-19 July, 2018, pp. 5697-5702.

    IJCAI 2018

    BibTex

    @inproceedings{navigli2018natural,
      title={Natural Language Understanding: Instructions for (Present and Future) Use},
      author={Navigli, Roberto},
      booktitle={IJCAI},
      pages={5697--5702},
      year={2018}
    }
                    
  • LREC 2018

    T. Pasini, F. M. Elia, R. Navigli

    Huge Automatically Extracted Training-Sets for Multilingual Word Sense Disambiguation

    Proceedings of the Language Resources and Evaluation Conference

    LREC 2018

    BibTex

    @inproceedings{pasini-etal-2018-huge,
        title = "Huge Automatically Extracted Training-Sets for Multilingual Word {S}ense {D}isambiguation",
        author = "Pasini, Tommaso  and
          Elia, Francesco  and
          Navigli, Roberto",
        booktitle = "Proceedings of the Eleventh International Conference on Language Resources and Evaluation ({LREC} 2018)",
        month = may,
        year = "2018",
        address = "Miyazaki, Japan",
        publisher = "European Language Resources Association (ELRA)",
        url = "https://www.aclweb.org/anthology/L18-1268",
    }
                    
  • SemEval 2018

    J. Camacho-Collados, C. Delli Bovi, L. Espinosa-Anke, S. Oramas, T. Pasini, E. Santus, V. Shwartz, R. Navigli, H. Saggion

    SemEval-2018 Task 9: Hypernym Discovery

    Proc. of the 12th International Workshop on Semantic Evaluation

    SemEval 2018

    BibTex

    @inproceedings{camacho-collados-etal-2018-semeval,
        title = "{S}em{E}val-2018 Task 9: Hypernym Discovery",
        author = "Camacho-Collados, Jose  and
          Delli Bovi, Claudio  and
          Espinosa-Anke, Luis  and
          Oramas, Sergio  and
          Pasini, Tommaso  and
          Santus, Enrico  and
          Shwartz, Vered  and
          Navigli, Roberto  and
          Saggion, Horacio",
        booktitle = "Proceedings of The 12th International Workshop on Semantic Evaluation",
        month = jun,
        year = "2018",
        address = "New Orleans, Louisiana",
        publisher = "Association for Computational Linguistics",
        url = "https://www.aclweb.org/anthology/S18-1115",
        doi = "10.18653/v1/S18-1115",
        pages = "712--724",
        abstract = "This paper describes the SemEval 2018 Shared Task on Hypernym Discovery. We put forward this task as a complementary benchmark for modeling hypernymy, a problem which has traditionally been cast as a binary classification task, taking a pair of candidate words as input. Instead, our reformulated task is defined as follows: given an input term, retrieve (or discover) its suitable hypernyms from a target corpus. We proposed five different subtasks covering three languages (English, Spanish, and Italian), and two specific domains of knowledge in English (Medical and Music). Participants were allowed to compete in any or all of the subtasks. Overall, a total of 11 teams participated, with a total of 39 different systems submitted through all subtasks. Data, results and further information about the task can be found at \url{https://competitions.codalab.org/competitions/17119}.",
    }
                    
  • WWW 2018

    V. Basile, R. Navigli

    From MultiJEDI to MOUSSE: Two ERC Projects for Innovating Multilingual Disambiguation and Semantic Parsing of Text

    Proceedings of the Web Conference 2018

    WWW 2018

    BibTex

    @inproceedings{BasileNavigli:18,
      title = {From MultiJEDI to MOUSSE: Two ERC Projects for Innovating Multilingual Disambiguation and Semantic Parsing of Text},
      author = {Basile, Valerio and Navigli, Roberto},
      booktitle = {Proc. of The Web Conference 2018},
      address = {Lyon, France},
      year = {2018},
    }