Underdata
Underdata describes datasets that are incomplete, unrepresentative, or otherwise insufficient to support reliable conclusions. It denotes a condition in which data coverage, granularity, or quality falls short of what robust analysis requires, often because of gaps, biases, or privacy constraints. Although not a formal statistical term, underdata is used in data governance, journalism, and risk assessment to highlight limitations in evidence.
Characteristics of underdata include missing values, sparse sampling, biased coverage across time, space, or populations, small
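Two of these characteristics, missing values and uneven coverage across populations, can be checked with simple counts. The following is a minimal sketch rather than an established diagnostic: the function names `missing_rate` and `group_counts`, the field names, and the toy records are all hypothetical and were invented for illustration.

```python
from collections import Counter


def missing_rate(records, field):
    """Return the fraction of records whose `field` is absent or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def group_counts(records, group_field):
    """Count records per group so that uneven coverage is visible."""
    return Counter(r.get(group_field) for r in records)


# Hypothetical toy records, invented for illustration only.
records = [
    {"region": "north", "value": 4.2},
    {"region": "north", "value": None},
    {"region": "north", "value": 3.9},
    {"region": "south", "value": None},
]

value_missing = missing_rate(records, "value")
coverage = group_counts(records, "region")
```

In this toy case half of the values are missing and one region contributes three of the four records, so any conclusion about the other region would rest on a single, unmeasured record. What counts as an acceptable missing rate or group size depends on the analysis and is not implied by these counts.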
Causes commonly involve budget constraints, regulatory or ethical restrictions, technical failures, nonresponse or attrition, and deliberate
Mitigation strategies focus on reducing gaps and quantifying uncertainty. Approaches include data augmentation or imputation where
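One common way to quantify uncertainty in a small sample is a percentile bootstrap interval for the mean. The sketch below is an illustration under stated assumptions, not a recommended procedure: the function name `bootstrap_mean_interval` and its parameters are hypothetical, and missing values are simply dropped, which is only reasonable if they are missing completely at random.

```python
import random
import statistics


def bootstrap_mean_interval(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of the observed values.

    Assumption: None entries are dropped (complete-case analysis).
    """
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values")

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(observed, k=len(observed)))
        for _ in range(n_resamples)
    )

    # Take the empirical alpha/2 and 1 - alpha/2 quantiles.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical small sample with gaps, invented for illustration only.
sample = [4.2, None, 3.9, 5.1, None, 4.4]
low, high = bootstrap_mean_interval(sample)
```

A wide interval signals that the available data support only a rough estimate. The bootstrap cannot correct for biased coverage: if the observed values are unrepresentative, the interval describes the sample and not the population of interest.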
Examples of underdata appear in public health with rare diseases, climate research with sparse sensor networks,