NGramme
NGramme is a term used in computational linguistics for a family of techniques and software resources that analyze text by counting occurrences of n-grams (contiguous sequences of n tokens or characters) in a corpus. It covers language-model construction, feature extraction for machine learning, and analytical workflows that rely on n-gram statistics.
NGramme methods can operate at the word level or the character level. They estimate probabilities such as
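As a minimal sketch of the word-level versus character-level distinction, the Python snippet below extracts and counts n-grams from a string. The function name `extract_ngrams`, its `level` parameter, and the whitespace tokenizer are assumptions made for illustration; they are not taken from any particular NGramme library.

```python
from collections import Counter


def extract_ngrams(text, n, level="word"):
    """Return the contiguous n-grams of `text` as a list of tuples.

    `level` is a hypothetical switch: "word" splits on whitespace,
    anything else treats every character as a token.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Whitespace splitting is a deliberately naive tokenizer for illustration.
    tokens = text.split() if level == "word" else list(text)
    # A sequence shorter than n yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Count word bigrams and character trigrams in two toy inputs.
word_bigrams = Counter(extract_ngrams("the cat sat on the mat", 2, level="word"))
char_trigrams = Counter(extract_ngrams("banana", 3, level="char"))
```

The same counting code serves both levels; only the tokenization step differs, which is why many n-gram tools can expose the choice as a single option.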
A typical NGramme pipeline includes text ingestion, tokenization, n-gram extraction, model training, and scoring. Implementations store
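The pipeline stages named above (ingestion, tokenization, n-gram extraction, training, scoring) can be sketched in a few lines of Python. This is an illustrative bigram model under stated assumptions, not the design of any specific implementation: the corpus is an in-memory list of sentences, tokenization is lowercased whitespace splitting, the boundary markers `<s>` and `</s>` are hypothetical, and add-one (Laplace) smoothing is chosen only because it is the simplest smoothing scheme.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical sentence-boundary markers


def tokenize(sentence):
    """Lowercase, split on whitespace, and add boundary markers."""
    return [BOS] + sentence.lower().split() + [EOS]


def train_bigram_model(corpus):
    """Count bigrams and their left contexts over a list of sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = tokenize(sentence)
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
        vocab.update(tokens[1:])                 # predictable tokens (BOS excluded)
    return contexts, bigrams, vocab


def bigram_prob(prev, cur, model):
    """Add-one smoothed estimate of P(cur | prev)."""
    contexts, bigrams, vocab = model
    return (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))


def score(sentence, model):
    """Natural-log probability of a sentence under the bigram model."""
    tokens = tokenize(sentence)
    return sum(math.log(bigram_prob(p, c, model)) for p, c in zip(tokens, tokens[1:]))


model = train_bigram_model(["the cat sat", "the cat ran"])
seen_order = score("the cat sat", model)
scrambled = score("sat cat the", model)
```

With this toy corpus, the sentence in its observed word order receives a higher score than the scrambled one. Words never seen in training still get a nonzero probability from the add-one term, although a real system would usually map them to a dedicated unknown-word token so that the distribution stays normalized.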
Practical applications include predictive text and autocomplete, spell-checking, information retrieval ranking, authorship attribution, language identification, and
N-gram concepts emerged in statistical natural language processing in the latter part of the 20th century and
Although the spelling NGramme with a capital G is sometimes used to brand libraries or projects in
Related topics include n-gram, language model, smoothing, and information retrieval.