Trigrams

Trigrams are sequences of three adjacent items drawn from a text or other sequence. They are the n = 3 case of the n-gram, a concept used in linguistics and natural language processing to model local context. Trigrams can be defined at the word level, as three consecutive words, or at the character level, as three consecutive characters.

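Both kinds can be read off by sliding a window of length three over a token list or a raw string. The following minimal Python sketch (with a made-up sample sentence) illustrates this:

    def trigrams(items):
        """Return all sequences of three adjacent items as tuples."""
        return [tuple(items[i:i + 3]) for i in range(len(items) - 2)]

    text = "the cat sat on the mat"

    # Word-level trigrams: three consecutive words.
    word_trigrams = trigrams(text.split())
    # [('the', 'cat', 'sat'), ('cat', 'sat', 'on'), ...]

    # Character-level trigrams: three consecutive characters (spaces included).
    char_trigrams = trigrams(text)
    # [('t', 'h', 'e'), ('h', 'e', ' '), ...]
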
In language modeling, the probability of a word given its two predecessors is written P(w3 | w1, w2). This probability is typically estimated from counts in a large corpus as c(w1, w2, w3) / c(w1, w2). Because many trigrams do not occur in a given dataset, smoothing methods are applied to assign nonzero probabilities to unseen triples. Common techniques include add-one (Laplace) smoothing, backoff, and Kneser-Ney smoothing.

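A minimal sketch of this estimation, assuming a toy corpus and using add-one smoothing for the unseen case (chosen here only for illustration), might look like:

    from collections import Counter

    corpus = "the cat sat on the mat the cat slept".split()  # toy corpus (assumed)
    vocab = set(corpus)

    trigram_counts = Counter(zip(corpus, corpus[1:], corpus[2:]))
    bigram_counts = Counter(zip(corpus, corpus[1:]))

    def p_mle(w1, w2, w3):
        """Maximum-likelihood estimate: c(w1, w2, w3) / c(w1, w2)."""
        if bigram_counts[(w1, w2)] == 0:
            return 0.0
        return trigram_counts[(w1, w2, w3)] / bigram_counts[(w1, w2)]

    def p_laplace(w1, w2, w3):
        """Add-one (Laplace) smoothing: every continuation gets a nonzero count."""
        return (trigram_counts[(w1, w2, w3)] + 1) / (bigram_counts[(w1, w2)] + len(vocab))

    print(p_mle("the", "cat", "sat"))      # 0.5: seen in the toy corpus
    print(p_mle("the", "cat", "ran"))      # 0.0: unseen triple
    print(p_laplace("the", "cat", "ran"))  # small but nonzero after smoothing
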
Trigrams have broad applications in text processing, including predicting the next word in a sentence, speech recognition, spelling correction, machine translation, information retrieval, and authorship analysis. Character trigrams are especially useful for language identification and for processing morphologically rich languages, while word trigrams can capture more semantic relations but require larger vocabularies.

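As a very rough illustration of the language-identification case (the reference texts and the overlap score below are assumed toy choices, far too small for real use), one can compare the character-trigram frequency profile of an input against stored profiles and pick the closest:

    from collections import Counter

    def char_trigram_profile(text):
        """Relative frequency of each character trigram in the text."""
        text = text.lower()
        counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
        total = sum(counts.values())
        return {gram: c / total for gram, c in counts.items()}

    # Toy reference profiles (assumed; real systems train on large corpora).
    profiles = {
        "english": char_trigram_profile("the quick brown fox jumps over the lazy dog"),
        "german": char_trigram_profile("der schnelle braune fuchs springt ueber den faulen hund"),
    }

    def identify(text):
        """Pick the reference language whose trigram profile overlaps most with the text."""
        sample = char_trigram_profile(text)
        def overlap(profile):
            return sum(min(freq, profile.get(gram, 0.0)) for gram, freq in sample.items())
        return max(profiles, key=lambda lang: overlap(profiles[lang]))

    print(identify("the dog jumps"))     # likely "english"
    print(identify("der hund springt"))  # likely "german"
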
Limitations of trigram models include data sparsity, since the number of possible triples grows rapidly with vocabulary size, and the fact that they only capture context from the two immediately preceding items, limiting long-range dependencies. They are often contrasted with higher-order n-gram models and with neural language models, which can model longer-range context and nonlinear patterns.