Keyness

Keyness is a term used in corpus linguistics to describe how characteristic a word or lexical item is of a target corpus relative to a reference corpus. A word is said to have high keyness if it occurs more frequently in the target corpus than would be expected given the reference distribution; conversely, a word with low or negative keyness is underrepresented in the target corpus.

Calculation typically involves comparing the observed frequencies of a word in the target and reference corpora with the frequencies that would be expected if the word were equally likely to occur in both.
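As a minimal sketch of this comparison, the Python example below assumes the log-likelihood (G2) statistic, which is one commonly used keyness measure among several (chi-squared is another). The function name `log_likelihood_keyness`, the sign convention, and the toy corpora are illustrative assumptions, not part of any particular tool.

```python
import math
from collections import Counter


def log_likelihood_keyness(freq_target, size_target, freq_ref, size_ref):
    """Signed log-likelihood (G2) keyness for a single word.

    Assumption for illustration: the score is made negative when the word
    is relatively less frequent in the target corpus than in the reference.
    """
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref

    # Expected frequencies if the word were equally likely in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    g2 = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    g2 *= 2

    overused = freq_target / size_target >= freq_ref / size_ref
    return g2 if overused else -g2


# Hypothetical toy corpora, far too small for real analysis.
target_tokens = "the cat sat on the mat the cat purred".split()
reference_tokens = "the dog sat on the log the dog barked at the cat".split()

target_counts = Counter(target_tokens)
reference_counts = Counter(reference_tokens)

keyness = {
    word: log_likelihood_keyness(
        target_counts[word], len(target_tokens),
        reference_counts[word], len(reference_tokens),
    )
    for word in set(target_counts) | set(reference_counts)
}

# List words from most overrepresented to most underrepresented.
for word, score in sorted(keyness.items(), key=lambda item: -item[1]):
    print(f"{word:8s} {score:+.3f}")
```

In this toy data, "cat" receives a positive score, "dog" a negative one, and "the" scores zero because its relative frequency is identical in both corpora.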

Keyness analysis is widely used for exploratory discourse analysis, genre description, authorial studies, and historical linguistics, among other applications.

Limitations and cautions include dependence on the quality and comparability of the corpora and sample size effects, among others.
