termbased
Termbased is an approach in information retrieval and text processing that relies on textual terms as the primary features for representing documents and queries. In a termbased framework, documents are often converted into a bag-of-words or n-gram representation, where each unique term is a dimension and its value reflects frequency or presence. This focus on discrete terms forms the basis for many traditional retrieval and classification tasks.
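As a rough illustration of the representation described above, the following Python sketch builds a bag-of-words or n-gram mapping from a short text. The tokenizer (lowercasing and splitting on non-alphanumeric characters) and the function names `tokenize`, `ngrams`, and `bag_of_terms` are assumptions made for this example, not part of any particular system.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_terms(text, n=1, binary=False):
    """Map each unique term (n-gram) to its frequency, or to 1 if binary=True (presence only)."""
    counts = Counter(ngrams(tokenize(text), n))
    if binary:
        return {term: 1 for term in counts}
    return dict(counts)

doc = "the cat sat on the mat"
unigrams = bag_of_terms(doc)               # term -> frequency
bigrams = bag_of_terms(doc, n=2)           # 2-gram -> frequency
presence = bag_of_terms(doc, binary=True)  # term -> 1 (presence)
```

Each key in the resulting dictionary plays the role of one dimension. Word order is discarded in the unigram case and only partially retained by the n-gram variant.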
Common techniques in termbased systems include indexing by terms, term frequency (TF), and inverse document frequency (IDF).
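Term indexing combined with TF and IDF weighting can be sketched as follows. This is a minimal example under stated assumptions: the three-document corpus is invented for illustration, tokenization is plain whitespace splitting, IDF is taken as log(N / df) (one of several common variants), and the helper names `idf` and `score` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented purely for illustration.
docs = {
    "d1": "the cat sat on the mat",
    "d2": "the dog chased the cat",
    "d3": "dogs and cats are pets",
}

def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simple tokenizer)."""
    return text.lower().split()

# Inverted index: term -> {doc_id: raw term frequency in that document}
index = defaultdict(dict)
for doc_id, text in docs.items():
    for term, tf in Counter(tokenize(text)).items():
        index[term][doc_id] = tf

def idf(term):
    """Inverse document frequency as log(N / df); 0.0 for terms absent from the index."""
    df = len(index.get(term, {}))
    return math.log(len(docs) / df) if df else 0.0

def score(query):
    """Rank documents by the sum of tf * idf over the query terms they contain."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf * idf(term)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = score("cat mat")
```

In this sketch the query "cat mat" ranks `d1` first, because it is the only document containing the rarer term "mat". Note that "dog" and "dogs" are treated as unrelated terms, since matching is purely on the surface form of each token.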
Applications of termbased methods include classic search engines, document retrieval, spam filtering, topic classification, and text
Limitations of termbased approaches include limited handling of synonymy, homographs, and polysemy, as well as vocabulary