OntoLearn Reloaded
is a graph-based approach to automatically learning a lexical taxonomy from a domain corpus and the Web.
The system is based on Word-Class Lattices and a taxonomy learning algorithm developed by Roberto Navigli, Paola Velardi and Stefano Faralli.
Dataset
We are releasing terminologies and taxonomies for the domains of: Artificial Intelligence, Finance, Animals, Plants, Vehicles, Viruses.
The following terminologies are either gold standards or were automatically extracted from a domain corpus.
The taxonomies are distributed as tab-separated values (tsv), with the fields in the order (term, hypernym, gloss) when a gloss is available and (term, hypernym) otherwise. For each domain we release a
tree-like taxonomy (TREE) and two directed acyclic graphs (DAG_1_3, DAG_0_99) obtained with different parameters of our graph-based approach.
We are also releasing the OWL/RDF version (owl) and the Lemon version (lemon) of the ontologies.
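As a rough illustration of the tab-separated layout described above, the following Python sketch reads rows of (term, hypernym) or (term, hypernym, gloss) into a hypernym map. The function names, the in-memory sample rows, and the choice to key glosses by (term, hypernym) pair are assumptions made for this example; they are not part of the release, and the actual files may differ in details such as encoding or header lines.

```python
import csv
import io
from collections import defaultdict


def parse_taxonomy(lines):
    """Parse tab-separated rows of (term, hypernym[, gloss]).

    `lines` is any iterable of strings, e.g. an open file.
    Returns (hypernyms, glosses): hypernyms maps each term to the set
    of its hypernyms; glosses maps (term, hypernym) to the gloss text
    for rows that carry one.
    """
    hypernyms = defaultdict(set)
    glosses = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        term, hypernym = row[0].strip(), row[1].strip()
        hypernyms[term].add(hypernym)
        if len(row) >= 3 and row[2].strip():
            glosses[(term, hypernym)] = row[2].strip()
    return hypernyms, glosses


def is_tree_like(hypernyms):
    """True if no term has more than one hypernym."""
    return all(len(parents) <= 1 for parents in hypernyms.values())


# Hypothetical rows, invented for this example (not taken from the data).
SAMPLE = (
    "dog\tcanine\ta domesticated canine\n"
    "canine\tmammal\n"
    "mammal\tanimal\n"
)

hypernyms, glosses = parse_taxonomy(io.StringIO(SAMPLE))
```

To read a downloaded file instead of the sample, pass an open file handle, for example `parse_taxonomy(open(path, encoding="utf-8", newline=""))`, where `path` is wherever the tsv file was saved. Under the assumptions above, `is_tree_like` should hold for a TREE file, while a DAG file may assign several hypernyms to one term.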
Downloads
Artificial Intelligence, extracted from the entire IJCAI proceedings from 1969 to 2011 and the ACL archive from 1979 to 2010. Terminology:txt TREE:tsv • owl • lemon DAG_1_3:tsv • owl • lemon DAG_0_99:tsv • owl • lemon
Finance, extracted from the freely available collection of the "Journal of Financial Economics" from 1995 to 2012 and of the "Review of Finance" from 1997 to 2012. Terminology:txt TREE:tsv • owl • lemon DAG_1_3:tsv • owl • lemon DAG_0_99:tsv • owl • lemon
Animals, used for comparison against the Animals sub-hierarchy of WordNet. Terminology*:txt TREE:tsv • owl • lemon DAG_1_3:tsv • owl • lemon DAG_0_99:tsv • owl • lemon
Plants, used for comparison against the Plants sub-hierarchy of WordNet. Terminology*:txt TREE:tsv • owl • lemon DAG_1_3:tsv • owl • lemon DAG_0_99:tsv • owl • lemon
Vehicles, used for comparison against the Vehicles sub-hierarchy of WordNet. Terminology*:txt TREE:tsv • owl • lemon DAG_1_3:tsv • owl • lemon DAG_0_99:tsv • owl • lemon
Viruses, used for comparison against the Viruses sub-hierarchy of MeSH. Terminology:txt TREE:tsv • owl • lemon DAG_1_3:tsv • owl • lemon DAG_0_99:tsv • owl • lemon
* The terminology for the Animals, Plants and Vehicles domains was kindly provided by Zornitsa Kozareva and Ed Hovy (note that terms are in their plural form).
Additional Downloads
We are also releasing the output of OntoLearn Reloaded on additional domain-specific corpora: