corpuslinguïstisch

Corpuslinguïstisch is the Dutch adjective meaning "corpus-linguistic". Corpus linguistics is a branch of linguistics that studies language by collecting and analyzing large electronic collections of texts, called corpora (singular: corpus). It combines quantitative methods, such as frequency counts and statistical analyses, with qualitative analysis of patterns in actual language use. The central idea is that language phenomena are best understood by examining large, authentic samples rather than isolated examples.
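As a rough illustration of the quantitative side, the sketch below counts word frequencies in a tiny text and normalizes them per 1,000 tokens so that counts from texts of different sizes can be compared. The sample sentence, the simple regular-expression tokenizer, and the function names `tokenize` and `relative_frequencies` are assumptions made for this example; real corpus work typically relies on far larger texts and more careful tokenization.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and extract alphabetic word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def relative_frequencies(tokens, per=1000):
    """Return each word's frequency normalized per `per` tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count * per / total for word, count in counts.items()}

# Toy "corpus" invented purely for illustration.
sample = "The cat sat on the mat. The dog sat on the log."
tokens = tokenize(sample)
counts = Counter(tokens)
rel = relative_frequencies(tokens)

print(len(tokens), "tokens,", len(counts), "distinct word forms")
print("raw count of 'the':", counts["the"])
print("per 1,000 tokens:", round(rel["the"], 1))
```

Normalizing by corpus size is what makes such counts comparable across corpora; raw counts alone mostly reflect how much text was collected.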

Corpora vary in size and scope and can be general, covering broad language use across genres, or

Common methods include concordance analysis, keyword and keyness analysis, collocation and dispersion studies, and frequency profiling.
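To make concordance analysis more concrete, here is a minimal keyword-in-context (KWIC) sketch: it finds every occurrence of a search word and shows a few tokens on either side. The function name `kwic`, the `width` parameter, and the sample text are hypothetical choices for this example, not part of any particular concordancing tool.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, node word, right context) for each hit of `keyword`.

    `width` is the number of tokens kept on each side of the node word.
    """
    tokens = re.findall(r"\w+", text.lower())
    node = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Toy text invented purely for illustration.
sample = "The cat sat on the mat. The dog sat on the log."
for left, node, right in kwic(sample, "sat", width=2):
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>12} [{node}] {right}")
```

Aligning the hits in a column is the conventional concordance layout, because it lets a reader scan the words that recur immediately before and after the node word.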

Limitations include representativeness, sampling bias, annotation quality, and the resource demands of less-studied languages. Ethical considerations

language-specific

domain-specific

sociolinguistics,

reproducibility.