Unigram
A unigram is a single unit, typically a word, used in linguistic analysis and natural language processing; it is an n-gram with n = 1. In the context of language modeling, a unigram model is an n-gram model of order 1 that assumes each word occurs independently of the others, so word order and context are ignored. Under this assumption, the probability of a sentence w1 w2 ... wn is approximated as the product of the probabilities of its words: P(w1, w2, ..., wn) ≈ P(w1) P(w2) ... P(wn) = ∏i P(wi), where i runs from 1 to n.
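The product formula can be illustrated with a minimal Python sketch. It assumes word probabilities are estimated as relative frequencies in a tiny made-up corpus; the corpus and the function names train_unigram and sentence_log_prob are hypothetical and chosen only for illustration. Logarithms are summed instead of multiplying raw probabilities, a common way to avoid numerical underflow on long sentences.

```python
from collections import Counter
import math


def train_unigram(tokens):
    """Estimate P(w) as count(w) / total tokens (relative frequency, an assumption here)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def sentence_log_prob(sentence_tokens, probs):
    """Return log P(w1, ..., wn) as the sum of log P(wi), per the independence assumption."""
    log_p = 0.0
    for word in sentence_tokens:
        p = probs.get(word, 0.0)
        if p == 0.0:
            # An unseen word gets probability zero in this unsmoothed sketch.
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Toy corpus, invented for this example only.
corpus = "the cat sat on the mat the dog sat".split()
probs = train_unigram(corpus)

log_p = sentence_log_prob("the cat sat".split(), probs)
shuffled_log_p = sentence_log_prob("sat cat the".split(), probs)

print(math.exp(log_p))           # probability of "the cat sat"
print(math.exp(shuffled_log_p))  # same words in another order
```

Because the model treats words as independent, reordering the words of a sentence leaves its probability unchanged, and any sentence containing a word absent from the corpus receives probability zero unless some form of smoothing is added.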
Computation of unigram probabilities is usually based on a text corpus. If N is the total number
Applications and limitations: Unigram models are simple and fast, and they serve as baselines for tasks such