Kontekstem
Kontekstem is a concept in computational linguistics that balances the benefits of stemming against the need to preserve contextual meaning. A kontekstem pairs a word's canonical stem with a contextual tag that signals how the form is used in a particular linguistic or domain setting. The approach aims to reduce surface-form variation while keeping the distinctions that matter to downstream tasks such as information retrieval or classification.
Methodology: The process begins with tokenization and morphological analysis to extract a stem from each token.
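As a rough illustration of the stem-plus-tag pairing and the tokenize-then-stem step, the following Python sketch may help. It is not a reference implementation: the names `Kontekstem`, `IRREGULAR`, `toy_stem`, and `kontekstems` are hypothetical, the suffix stripper is a deliberately crude stand-in for real morphological analysis, and the context tag is assumed to be supplied by the caller rather than inferred from the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Kontekstem:
    """A stem paired with a context tag (hypothetical structure)."""
    stem: str
    context: str


# Hypothetical hand-made lexicon for irregular forms; a real system
# would rely on a morphological analyzer instead.
IRREGULAR = {"ran": "run"}


def toy_stem(token: str) -> str:
    """Crude suffix stripper, for illustration only."""
    token = token.lower()
    if token in IRREGULAR:
        return IRREGULAR[token]
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            base = token[: -len(suffix)]
            # Undo consonant doubling after "ing"/"ed": "runn" -> "run".
            if (
                suffix != "s"
                and base[-1] == base[-2]
                and base[-1] not in "aeiou"
            ):
                base = base[:-1]
            return base
    return token


def kontekstems(text: str, context: str) -> list[Kontekstem]:
    """Tokenize, stem each token, and attach the caller-supplied tag."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [Kontekstem(toy_stem(tok), context) for tok in tokens]
```

Under these assumptions, `kontekstems("running ran", "sport")` yields two identical `Kontekstem` values with stem `run` and context `sport`, while the same stem tagged `health` compares as a different feature. Making the pair a frozen dataclass keeps it hashable, so it can serve directly as a dictionary key or feature identifier.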
Applications and advantages: By aggregating inflected forms under context-aware stems, kontekstem can reduce feature sparsity while
Examples: In sports and health literature, "running" and "ran" may map to kontekstem run with context sport,
See also: stemming, lemmatization, contextual word representations, polysemy, information retrieval.