lemmatiseerde
Lemmatization is the process of reducing words to their lemma, the canonical base form found in dictionaries. In Dutch, the past participle form lemmatiseerde is used to describe text that has been processed to replace words with their lemmas. A lemmatized text typically uses dictionary forms such as lopen for all inflected verb forms and the singular form for nouns.
In practice, lemmatization combines morphological analysis with lemmat dictionaries. Many systems use a pipeline that includes
Lemmatization differs from stemming. Stemming cuts words to a rough stem that may not be a valid
Applications of lemmatized text include improved information retrieval, search over inflected languages, text mining, machine translation,
Example: the Dutch words loopt, liep, gelopen all map to the lemma lopen in a lemmatized representation.