Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison


On this page we have gathered and unified five standard all-words Word Sense Disambiguation datasets. We hope this unified framework will make it easier for researchers to evaluate their models and will enable a fair comparison among systems. The evaluation framework is currently available for English, and all sense annotations use the WordNet 3.0 sense inventory.

We aim to extend this framework to other languages and sense inventories. If you would like to contribute your data to this framework, please check the "Share your data" section.

Download the whole evaluation framework here (165MB). Alternatively, you can download individual training corpora and evaluation datasets.

We have set up a CodaLab competition based on this evaluation framework at https://competitions.codalab.org/competitions/15984. Join our Google Group to post your questions or suggestions about this unified framework.



If you use data or information from this website, please cite the following reference paper:

Alessandro Raganato, Jose Camacho-Collados and Roberto Navigli.
Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison.
Proceedings of EACL 2017, Valencia, Spain.