stopsõnade
Stopsõnade (stop words) are common function words that occur very frequently in a language and typically carry little lexical meaning on their own. In text processing and information retrieval they are often removed from texts before indexing or querying to reduce dimensionality and computational cost, and to improve the signal-to-noise ratio.
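The removal step described above can be illustrated with a minimal Python sketch. The stop word set here is a tiny hand-picked sample of Estonian function words chosen for illustration, not a complete or official list, and the helper names `tokenize` and `remove_stop_words` are hypothetical. Matching is on exact surface forms only.

```python
import re

# Tiny illustrative sample, NOT a complete or official Estonian stop word list.
ESTONIAN_STOP_WORDS = frozenset({"ja", "on", "ei", "et", "see", "või", "aga", "kui"})


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    In Python 3, \\w is Unicode-aware for str patterns, so letters
    such as õ, ä, ö and ü are kept inside tokens.
    """
    return re.findall(r"\w+", text.lower())


def remove_stop_words(tokens, stop_words=ESTONIAN_STOP_WORDS):
    """Return the tokens that are not in the stop word set, preserving order."""
    return [t for t in tokens if t not in stop_words]


# "Tallinn is the capital of Estonia and its largest city"
tokens = tokenize("Tallinn on Eesti pealinn ja suurim linn")
content_tokens = remove_stop_words(tokens)
print(content_tokens)  # ['tallinn', 'eesti', 'pealinn', 'suurim', 'linn']
```

Using a `frozenset` keeps each membership test constant-time on average, which matters when the filter runs over every token of a large collection.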
Because languages differ, stop word lists are language-specific, and they are often tailored to a particular domain as well. A list for Estonian will include
Removal is not universal: some tasks benefit from retaining stop words to preserve syntax, discourse structure,
In Estonian, the high level of inflection means that many stop words have multiple morphological variants;
Applications include search engines, document indexing, and text mining, where removing stop words reduces index size
Whether removing stop words helps a given task is debated; modern NLP often uses lighter or