lemmatizáló
A lemmatizáló, or lemmatizer, is a natural language processing component that reduces inflected or derived word forms to their lemma, the canonical base form found in a dictionary. Lemmas serve as standardized representations of words for tasks such as indexing, search, and linguistic analysis.
Most lemmatizers combine morphological analysis with dictionaries or rules. They may be rule-based, rely on statistical models trained on annotated corpora, or use neural approaches that learn to map surface forms to lemmas.
Lemmatization differs from stemming. A lemmatizer aims to produce a valid base form that exists in the language's vocabulary, whereas a stemmer truncates words with heuristic suffix rules and may produce forms that are not real words (for example, "studies" stemmed to "studi" rather than lemmatized to "study").
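The contrast above can be sketched with a minimal, self-contained example; the suffix rules and the tiny lemma dictionary here are illustrative stand-ins, not a real linguistic resource.

```python
# Toy lemma dictionary; a real lemmatizer would use a full lexicon
# and morphological analysis.
LEMMA_DICT = {"studies": "study", "better": "good", "ran": "run"}

def naive_stem(word: str) -> str:
    # Heuristic suffix truncation, as a stemmer does; the result
    # may not be a valid word.
    for suffix in ("ies", "es", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def lemmatize(word: str) -> str:
    # Dictionary lookup returns the canonical base form when known.
    return LEMMA_DICT.get(word, word)

print(naive_stem("studies"))  # → "stud" (not a word)
print(lemmatize("studies"))   # → "study" (valid lemma)
```

Note how the stemmer's output "stud" would still unify "studies" and "studied" under one key for retrieval, but only the lemmatizer's output is a dictionary-valid form.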
A typical workflow includes tokenization, part-of-speech tagging, disambiguation using context, and mapping surface forms to lemmas through dictionary lookup or morphological rules.
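The role of part-of-speech information in that workflow can be sketched as follows; the tagged lexicon is a hypothetical toy example, chosen to show how context-derived tags disambiguate homographs such as English "saw".

```python
# Toy (token, POS) → lemma lexicon; in practice this mapping comes from
# a morphological dictionary or trained model.
TAGGED_LEXICON = {
    ("saw", "VERB"): "see",  # "I saw it" — past tense of "see"
    ("saw", "NOUN"): "saw",  # "a rusty saw" — the tool
    ("was", "VERB"): "be",
}

def lemmatize(token: str, pos: str) -> str:
    # The POS tag, assigned from sentence context, selects the
    # correct lemma for ambiguous surface forms.
    return TAGGED_LEXICON.get((token.lower(), pos), token.lower())

print(lemmatize("saw", "VERB"))  # → "see"
print(lemmatize("saw", "NOUN"))  # → "saw"
```

Without the tag, the two readings of "saw" would collapse to a single (wrong for one of them) lemma, which is why tagging precedes lemma lookup in the pipeline.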
Applications span information retrieval, text mining, machine translation, and various NLP tasks that benefit from normalized word forms.