korpuszméret
Korpuszméret (Hungarian for "corpus size") is the size of a text corpus, a large, structured collection of texts. Size is a key factor in corpus linguistics and computational linguistics because it affects the reliability and scope of any analysis performed on the data. A larger corpus generally provides a more representative sample of language use, which supports more robust statistical findings and makes it possible to observe rarer linguistic phenomena. A smaller corpus may yield results that are less generalizable and skewed by the limited data.
The size of a corpus is typically measured in the number of words or tokens it contains.
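As a rough illustration of measuring size in tokens, the following minimal sketch counts running words (tokens) and distinct word forms (types) in a small collection of texts. The function name `corpus_size`, the sample sentences, and the simple regular-expression tokenizer are assumptions made for this example only; real corpora are usually tokenized with language-specific tools, and the resulting counts depend on the tokenization rules chosen.

```python
import re

def corpus_size(texts):
    """Return (token_count, type_count) for an iterable of strings.

    Hypothetical helper: tokens are lowercased runs of word characters,
    which is a simplification of real tokenization.
    """
    tokens = []
    for text in texts:
        # \w+ matches runs of letters, digits, and underscores
        tokens.extend(re.findall(r"\w+", text.lower()))
    return len(tokens), len(set(tokens))

# Illustrative two-sentence "corpus" (made-up data)
sample = ["The cat sat on the mat.", "The dog sat too."]
n_tokens, n_types = corpus_size(sample)
print(n_tokens, n_types)  # prints: 10 7
```

Reporting both figures can be useful because two corpora with the same token count may differ considerably in vocabulary.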
The concept of korpuszméret is also intertwined with the notion of corpus balance, which refers to the