Embedding Words and Senses Together via Joint Knowledge-Enhanced Training


We present SW2V (Senses and Words to Vectors), a new model which simultaneously learns embeddings for both words and senses by exploiting knowledge from both text corpora and semantic networks in a single joint training phase, with sense embeddings emerging as a natural feature of the training. Word and sense embeddings are therefore represented in the same vector space and can be compared directly.
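Because words and senses share one space, a word vector can be compared directly with the vectors of its candidate senses. Below is a minimal Python sketch of this idea (not the official SW2V code); it assumes the pre-trained files are in the standard word2vec text format, and the file name sw2v_wikipedia_300d.txt and the BabelNet-style sense key bn:00008364n are placeholders, so adjust both to match the files you actually download.

    # Minimal sketch: comparing a word vector with a sense vector in the
    # shared SW2V space. Assumes word2vec text format; file name and sense
    # key below are hypothetical placeholders.
    import numpy as np
    from gensim.models import KeyedVectors

    vectors = KeyedVectors.load_word2vec_format("sw2v_wikipedia_300d.txt", binary=False)

    def cosine(u, v):
        # Cosine similarity between two embedding vectors.
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    # Words and senses live in the same space, so they are directly comparable.
    word_vec = vectors["bank"]
    sense_vec = vectors["bn:00008364n"]  # hypothetical sense key
    print(cosine(word_vec, sense_vec))

The same comparison underlies a simple form of disambiguation: among the candidate senses of a word in context, pick the one whose vector is closest to the surrounding context vectors.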

Data and Code

Currently available files for download:


  • Code for obtaining word and sense embeddings from any pre-processed corpus by applying SW2V, including a README file.


  • 300-dimensional word and sense embeddings pre-trained on Wikipedia (download here) or on the UMBC corpus (download here).


  • Reference paper

    When using these resources, please cite the following paper:

    Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci and Roberto Navigli.
    Embedding Words and Senses Together via Joint Knowledge-Enhanced Training.
    arXiv preprint arXiv:1612.02703, 2016.

Contact

Should you have any enquiries about any of these resources, please contact Massimiliano Mancini (mancini [at] dis.uniroma1 [dot] it), Jose Camacho-Collados (collados [at] di.uniroma1 [dot] it), Ignacio Iacobacci (iacobacci [at] di.uniroma1 [dot] it) or Roberto Navigli (navigli [at] di.uniroma1 [dot] it).




Last update: 9 Dec. 2016