1055linear124 - Infinite Lexicon

1055linear124

1055linear124 is a synthetic benchmark dataset used in statistical learning and machine learning research to evaluate linear modeling techniques. The name encodes its dimensions and structure: 1055 samples and 124 features, with responses generated by a linear data-generating process.
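As an illustration only, a dataset of this shape could be generated along the following lines. This is a minimal sketch rather than the reference construction: it assumes standard normal features that are standardized column by column and a response of the form y = Xβ + ε. The coefficient vector `beta`, the noise level `noise_std`, the seed, and the function name `make_linear_dataset` are all hypothetical choices that this entry does not specify. Only the Python standard library is used, so the sketch has no dependencies.

```python
import random
import statistics

N_SAMPLES, N_FEATURES = 1055, 124  # dimensions stated in the entry


def make_linear_dataset(noise_std=1.0, seed=0):
    """Sketch of a 1055 x 124 linear dataset; beta and noise_std are assumed."""
    rng = random.Random(seed)
    # Draw each feature column from a standard normal distribution,
    # then standardize it to zero mean and unit (population) variance.
    columns = []
    for _ in range(N_FEATURES):
        col = [rng.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
        mean = statistics.fmean(col)
        std = statistics.pstdev(col)
        columns.append([(v - mean) / std for v in col])
    # Transpose to row-major layout: X[i][j] is feature j of sample i.
    X = [list(row) for row in zip(*columns)]
    # Hypothetical coefficient vector; the entry does not specify one.
    beta = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
    # Linear response with additive Gaussian noise (noise model assumed).
    y = [
        sum(x * b for x, b in zip(row, beta)) + rng.gauss(0.0, noise_std)
        for row in X
    ]
    return X, y, beta


X, y, beta = make_linear_dataset()
print(len(X), len(X[0]), len(y))  # 1055 124 1055
```

Fixing the seed makes repeated calls return identical data, which is what allows different estimators to be compared on the same draw.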

Construction: The feature matrix X is drawn from a standard normal distribution and standardized. The true

Usage and relevance: 1055linear124 serves as a common testbed for comparing estimation methods such as ordinary

Variants and availability: Several variants of the dataset exist, differing in noise level, sparsity, or feature
