lemmatize
Lemmatization is the process of reducing a word to its lemma, the canonical base form found in a dictionary. It uses vocabulary and morphological analysis to determine the appropriate dictionary form, and it often requires the word’s part of speech to select the correct lemma. This makes lemmatization more linguistically informed than simple stemming, which relies on heuristic affix stripping.
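The role of part of speech can be illustrated with a minimal sketch. It assumes a tiny hand-written lexicon (`LEXICON`) and two hypothetical helper functions, `lemmatize` and `naive_stem`, none of which come from a real library. An actual lemmatizer would draw on a far larger lexical resource and a fuller morphological analysis.

```python
# Toy lexicon mapping (surface form, part of speech) -> lemma.
# The entries are illustrative assumptions, not data from a real resource.
LEXICON = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("better", "ADJ"): "good",
    ("mice", "NOUN"): "mouse",
    ("studies", "VERB"): "study",
    ("studies", "NOUN"): "study",
}


def lemmatize(word, pos):
    """Return the lemma for (word, pos); fall back to the lowercased word."""
    key = (word.lower(), pos)
    return LEXICON.get(key, word.lower())


def naive_stem(word):
    """Crude suffix stripping, shown only for contrast with lemmatize()."""
    word = word.lower()
    for suffix in ("ies", "es", "s", "ing", "ed"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    print(lemmatize("saw", "VERB"))   # lemma depends on part of speech
    print(lemmatize("saw", "NOUN"))
    print(lemmatize("studies", "NOUN"), naive_stem("studies"))
```

In this sketch the same surface form "saw" maps to different lemmas depending on its part of speech, while the suffix stripper reduces "studies" to a string that is not a dictionary word.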
In practice, a lemmatizer consults a lexical resource that lists lemma forms for different inflections and
Lemmatization differs from stemming in that it aims to produce valid dictionary forms rather than arbitrary
Applications of lemmatization include information retrieval, text preprocessing for natural language processing, and any task that