BabelDomains: Large-Scale Domain Labeling of Lexical Resources

BabelDomains is a unified resource that provides domain information for lexical items from three lexical resources (BabelNet, Wikipedia and WordNet).
Each synset is associated with a pre-defined domain of knowledge from the Wikipedia featured articles page.
The latest version of BabelDomains has been integrated into BabelNet, in both the online interface and the API.


Download the whole package [45MB], which includes the following resources:

  • A README file.

  • The BabelDomains resource, with domain information for BabelNet, Wikipedia and WordNet.

  • A file containing the list with all 34 domains.

  • A file containing the seeds for each domain from the Wikipedia featured articles.

  • A file containing the lexical vectors for each domain (i.e. a sorted list of words representing the domain).

  • Two gold standard datasets for WordNet and BabelNet used in the evaluation.

  • A Python script to evaluate the domain annotations with respect to the two gold standard datasets.

  • The reference paper (see below).
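The comparison of domain annotations against a gold standard, as performed by the evaluation script listed above, can be illustrated with a minimal sketch. It is not the script shipped in the package: the tab-separated "item<TAB>domain" format, the item identifiers, the domain names and the function names parse_labels and evaluate are all assumptions made for illustration. Precision is computed over the gold items that received a label, recall over all gold items.

```python
from typing import Dict, Tuple


def parse_labels(text: str) -> Dict[str, str]:
    """Parse lines of the (assumed) form 'item<TAB>domain' into a dict."""
    labels: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip malformed lines
        labels[parts[0]] = parts[1]
    return labels


def evaluate(predicted: Dict[str, str], gold: Dict[str, str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted labels against gold labels."""
    # Only gold items that were actually labeled count towards precision.
    attempted = [item for item in gold if item in predicted]
    correct = sum(1 for item in attempted if predicted[item] == gold[item])
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical identifiers and domain names, for illustration only.
gold = parse_labels(
    "item_1\tMusic\n"
    "item_2\tMathematics\n"
    "item_3\tBiology\n"
    "item_4\tMusic\n"
)
predicted = parse_labels(
    "item_1\tMusic\n"
    "item_2\tBiology\n"
    "item_3\tBiology\n"
)

precision, recall, f1 = evaluate(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# prints: precision=0.667 recall=0.500 f1=0.571
```

To apply the same idea to the real files, parse_labels would need to be adapted to the formats described in the README included in the package.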

Reference paper

    When using these resources, please cite the following paper:

    José Camacho-Collados and Roberto Navigli.
    BabelDomains: Large-Scale Domain Labeling of Lexical Resources.
    In Proceedings of EACL (short), Valencia, Spain, 2017.


    BabelDomains is licensed under a Creative Commons Attribution - Noncommercial - Unported 3.0 License.


    Should you have any enquiries about any of the resources, please contact José Camacho Collados (collados [at] di.uniroma1 [dot] it) or Roberto Navigli (navigli [at] di.uniroma1 [dot] it).

    BabelDomains is an output of the MultiJEDI ERC Starting Grant No. 259234.

    Last update: 17 Feb. 2017 by José Camacho Collados