SemCor - Infinite Lexicon

SemCor

SemCor (Semantic Concordance) is a large, manually sense-tagged corpus of English built for word sense disambiguation (WSD) research. It aligns tokens with WordNet sense identifiers, providing a gold-standard resource for training and evaluating WSD methods. Each content word in the text is tagged with its intended WordNet sense, and the annotation also records part-of-speech information.
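To make the annotation scheme concrete, the following minimal sketch shows one way a sense-tagged token could be represented in memory and used to train and score a most-frequent-sense baseline. It is an illustration under stated assumptions, not SemCor's actual file format or any library's API: the `TaggedToken` class, the `train_mfs` and `accuracy` helpers, the part-of-speech labels, and the toy data are all hypothetical, and the sense strings only imitate the general shape of WordNet sense keys.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedToken:
    """One token of a sense-tagged corpus (hypothetical layout, not SemCor's file format)."""
    surface: str
    lemma: str
    pos: str                      # coarse part of speech, e.g. "NOUN"
    sense: Optional[str] = None   # WordNet-style sense key; None for untagged words


def train_mfs(tokens):
    """Learn the most frequent sense for each (lemma, pos) pair."""
    counts = defaultdict(Counter)
    for tok in tokens:
        if tok.sense is not None:
            counts[(tok.lemma, tok.pos)][tok.sense] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def accuracy(model, tokens):
    """Fraction of sense-tagged tokens whose gold sense the model predicts."""
    gold = [t for t in tokens if t.sense is not None]
    if not gold:
        return 0.0
    hits = sum(model.get((t.lemma, t.pos)) == t.sense for t in gold)
    return hits / len(gold)


# Toy data; the sense keys are illustrative placeholders.
train = [
    TaggedToken("The", "the", "DET"),
    TaggedToken("bank", "bank", "NOUN", "bank%1:14:00::"),
    TaggedToken("banks", "bank", "NOUN", "bank%1:14:00::"),
    TaggedToken("bank", "bank", "NOUN", "bank%1:17:01::"),
]
test = [
    TaggedToken("bank", "bank", "NOUN", "bank%1:14:00::"),
    TaggedToken("bank", "bank", "NOUN", "bank%1:17:01::"),
]

mfs = train_mfs(train)
print(mfs[("bank", "NOUN")], accuracy(mfs, test))  # bank%1:14:00:: 0.5
```

Keying the model on the (lemma, part of speech) pair reflects the fact that the annotation carries both pieces of information; a real system would read the gold tags from the corpus itself rather than from hand-written tokens.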

The corpus comprises texts drawn from various genres to ensure lexical and stylistic diversity. The tagging

SemCor has become a standard benchmark in the word sense disambiguation community and is frequently used for

Availability and history: SemCor was first produced in the early 1990s by researchers affiliated with