CaseNom
CaseNom is a framework for labeling nominal case in linguistic annotation and natural language processing. It provides a compact, language-agnostic tag set that marks the case, and thereby the syntactic role, of nouns, so that corpora and tools can annotate and exchange data consistently. The core set comprises NOM (nominative), ACC (accusative), GEN (genitive), DAT (dative), LOC (locative), INS (instrumental), and VOC (vocative), plus UNKNOWN for unclear cases. The tag set is intended to be extensible to accommodate language-specific phenomena and cliticized or multi-part case systems.
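As an illustration only, the core tag set could be modeled as an enumeration. The names `CaseTag` and `parse_tag` are hypothetical and not part of any published CaseNom tooling; mapping unrecognized labels to UNKNOWN is an assumption of this sketch, and a real pipeline might register language-specific extension tags instead.

```python
from enum import Enum


class CaseTag(Enum):
    """Core CaseNom tags as listed above (hypothetical representation)."""
    NOM = "nominative"
    ACC = "accusative"
    GEN = "genitive"
    DAT = "dative"
    LOC = "locative"
    INS = "instrumental"
    VOC = "vocative"
    UNKNOWN = "unknown"


def parse_tag(label: str) -> CaseTag:
    """Look up a tag by its label, case-insensitively.

    Assumption: labels outside the core set fall back to UNKNOWN.
    """
    try:
        return CaseTag[label.upper()]
    except KeyError:
        return CaseTag.UNKNOWN
```

With this sketch, `parse_tag("acc")` yields `CaseTag.ACC`, while a label outside the core set, such as `"ABL"`, yields `CaseTag.UNKNOWN`.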
Implementation and usage: In annotation pipelines, CaseNom tags are attached to tokens alongside lemmas, e.g., dog-NOM.
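A minimal sketch of reading such annotations, assuming the tag is joined to the token by a final hyphen as in dog-NOM. The `Token` type and `parse_annotated` function are hypothetical helpers for illustration; whitespace tokenization and stripping of trailing punctuation are simplifying assumptions.

```python
from typing import List, NamedTuple, Optional

# Core tags named in the description above.
CORE_TAGS = {"NOM", "ACC", "GEN", "DAT", "LOC", "INS", "VOC", "UNKNOWN"}


class Token(NamedTuple):
    form: str            # surface form without the tag suffix
    case: Optional[str]  # CaseNom tag, or None if the token is untagged


def parse_annotated(text: str) -> List[Token]:
    """Split whitespace-separated text into (form, case) pairs."""
    tokens = []
    for raw in text.split():
        word = raw.rstrip(".,;:!?")  # drop trailing punctuation
        form, sep, tag = word.rpartition("-")
        if sep and tag in CORE_TAGS:
            tokens.append(Token(form, tag))
        else:
            # No recognized tag: keep the word whole (e.g. "well-known").
            tokens.append(Token(word, None))
    return tokens
```

Because only suffixes in the core set are treated as tags, ordinary hyphenated words pass through unchanged. Extension tags would need to be added to `CORE_TAGS`.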
Origins and scope: CaseNom was proposed as a standardization concept in cross-language grammar and corpus work.
Applications and limitations: It serves corpus linguistics, syntactic parsing, morphological analysis, lexicon development, and machine translation.
Example: English: The dog-NOM chased the cat-ACC. German: Der Hund-NOM jagt die Katze-ACC. Turkish: Kedi-NOM köpeği-ACC