Corpussuch
Corpussuch is a neologism in linguistics and digital humanities for a framework for building and analyzing language corpora that emphasizes representativeness and comparability across domains. The term covers practices for documenting corpus compilation, sampling, annotation, and analysis so that findings generalize beyond a single data source.
Origin and concept: The name combines corpus and such, signaling the focus on constructing corpora that exemplify
Principles and methods: The approach emphasizes transparent methods, specifying data sources, sampling strategies (random, stratified), annotation schemas, versioning,
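As an illustration of the stratified sampling and documentation practices named under Principles and methods, the following is a minimal sketch, not a prescribed implementation. The function name `stratified_sample`, the `genre` field, the toy documents, and the manifest layout are all hypothetical and chosen only for the example. It draws a fixed number of documents per stratum with a seeded random generator and records the sampling parameters so the sample can be regenerated.

```python
import random
from collections import defaultdict


def stratified_sample(documents, key, per_stratum, seed=0):
    """Draw up to `per_stratum` documents from each stratum, reproducibly.

    Hypothetical helper for illustration; `key` names the metadata field
    that defines the strata (for example, a genre label).
    """
    rng = random.Random(seed)  # fixed seed so the sample can be regenerated
    strata = defaultdict(list)
    for doc in documents:
        strata[doc[key]].append(doc)

    sample = []
    for name in sorted(strata):  # sorted so the draw order is deterministic
        group = strata[name]
        k = min(per_stratum, len(group))  # small strata contribute all they have
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical toy corpus: 6 news, 3 forum, and 1 fiction document.
documents = [
    {"id": i, "genre": genre}
    for i, genre in enumerate(["news"] * 6 + ["forum"] * 3 + ["fiction"] * 1)
]

sample = stratified_sample(documents, key="genre", per_stratum=2, seed=42)

# Record how the sample was drawn, alongside the sample itself.
manifest = {
    "sampling": "stratified",
    "stratum_key": "genre",
    "per_stratum": 2,
    "seed": 42,
    "sample_ids": sorted(doc["id"] for doc in sample),
}
```

Because the seed and stratum settings are stored in the manifest, anyone with the same source documents can rerun the function and obtain the same sample, which is the kind of transparency the framework asks for.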
Applications and examples: Used to study lexical variation, pragmatics, and discourse patterns across genres (news, social
Reception and critique: As a concept, it promotes reproducibility but may be resource-intensive and risk over-emphasizing
See also and references: Corpus linguistics; Reproducibility; Data ethics.