SemAlign: A Robust Approach to Aligning Heterogeneous Lexical Resources

SemAlign is a hybrid approach for robustly aligning arbitrary pairs of lexical resources, irrespective of their structure or the availability of training data. SemAlign first transforms each lexical resource into a semantic network and then aligns pairs of resources based on the cross-level semantic similarity of their individual concepts. The approach leverages a similarity measure that enables the structural comparison of senses across lexical resources. SemAlign has been applied effectively to align WordNet with three different collaborative resources: Wikipedia, Wiktionary and OmegaWiki [1].
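The two steps described above (build a semantic network per resource, then align concepts by similarity) can be illustrated with a toy sketch. This is not the authors' implementation and does not reproduce the similarity measure of [1]: a plain cosine over the lemmas of a concept and its direct neighbours stands in for it. The concept ids, lemmas, edges, the 0.5 threshold, and the function names `signature`, `cosine` and `align` are all invented for illustration.

```python
from collections import Counter
from math import sqrt

# Toy semantic networks: concept id -> (lemma, ids of related concepts).
# Every id, lemma and edge below is made up for this example.
RESOURCE_A = {
    "a:bank#1": ("bank", ["a:money#1", "a:loan#1"]),
    "a:bank#2": ("bank", ["a:river#1", "a:shore#1"]),
    "a:money#1": ("money", ["a:bank#1"]),
    "a:loan#1": ("loan", ["a:bank#1"]),
    "a:river#1": ("river", ["a:bank#2"]),
    "a:shore#1": ("shore", ["a:bank#2"]),
}
RESOURCE_B = {
    "b:bank_fin": ("bank", ["b:money", "b:credit"]),
    "b:bank_geo": ("bank", ["b:river", "b:water"]),
    "b:money": ("money", ["b:bank_fin"]),
    "b:credit": ("credit", ["b:bank_fin"]),
    "b:river": ("river", ["b:bank_geo"]),
    "b:water": ("water", ["b:bank_geo"]),
}


def signature(network, concept):
    """Bag of lemmas for a concept and its direct neighbours (stand-in signature)."""
    lemma, neighbours = network[concept]
    bag = Counter([lemma])
    bag.update(network[n][0] for n in neighbours)
    return bag


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def align(net_a, net_b, threshold=0.5):
    """Map each concept in net_a to its most similar same-lemma concept in net_b.

    Concepts whose best score falls below the (assumed) threshold stay unaligned.
    """
    alignment = {}
    for ca, (lemma_a, _) in net_a.items():
        sig_a = signature(net_a, ca)
        best, best_score = None, 0.0
        for cb, (lemma_b, _) in net_b.items():
            if lemma_b != lemma_a:
                continue  # only senses of the same word are candidates
            score = cosine(sig_a, signature(net_b, cb))
            if score > best_score:
                best, best_score = cb, score
        if best is not None and best_score >= threshold:
            alignment[ca] = best
    return alignment


if __name__ == "__main__":
    for source, target in align(RESOURCE_A, RESOURCE_B).items():
        print(source, "->", target)
```

On this toy data the financial and river senses of "bank" are mapped to their counterparts, while "loan" and "shore" stay unaligned because the second resource has no sense with the same lemma. Restricting candidates to same-lemma senses and using a fixed threshold are simplifications chosen for the sketch; see [1] for the actual method.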

Datasets


The WordNet-OmegaWiki dataset used in our experiments is available for download.



Read more about SemAlign:

[1] Mohammad Taher Pilehvar and Roberto Navigli. A Robust Approach to Aligning Heterogeneous Lexical Resources. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, USA, June 22-27, 2014, pp. 468-478.


Last update: 30 Oct. 2014 by Mohammad Taher Pilehvar