regressioimputation
Regressioimputation, commonly referred to as regression imputation, is a statistical method for handling missing data by filling in missing values with predictions derived from a regression model. The approach uses observed data to estimate the relationship between the variable with missing values and a set of other variables that are available. A regression model is fitted with the target variable as the dependent variable and the predictors as the independent variables. The model is then used to predict the missing entries. In single regression imputation, the predicted values serve as substitutes for the missing data. For binary or categorical targets, logistic or multinomial regression can be employed.
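As a rough illustration of single regression imputation, the following sketch fits a simple linear regression on the complete cases and fills the missing entries with the fitted predictions. The function name `regression_impute`, the use of `None` to mark missing values, and the restriction to one fully observed predictor are assumptions made for this example, not part of any particular library.

```python
from statistics import fmean


def regression_impute(x, y):
    """Fill None entries of y with predictions from a regression of y on x.

    Hypothetical helper for illustration. Assumes x is fully observed and
    that at least two observed pairs have distinct x values.
    """
    # Complete cases: pairs where the target is observed.
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    x_bar = fmean([xi for xi, _ in observed])
    y_bar = fmean([yi for _, yi in observed])

    # Closed-form ordinary least squares estimates of slope and intercept.
    sxx = sum((xi - x_bar) ** 2 for xi, _ in observed)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in observed)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Keep observed values; replace missing ones with the fitted prediction.
    filled = [
        yi if yi is not None else intercept + slope * xi
        for xi, yi in zip(x, y)
    ]
    return filled, intercept, slope


# Toy data: the observed points lie exactly on y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0, 5.0, None, 9.0, None, 13.0]
filled, intercept, slope = regression_impute(x, y)
print(filled)  # the two missing entries are filled with 7.0 and 11.0
```

The sketch uses a single predictor to stay dependency-free; with several predictors the same idea applies, with the closed-form estimates replaced by a matrix least-squares fit.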
Key assumptions include that the data are missing at random (MAR) or missing completely at random (MCAR), and
However, regression imputation has notable limitations. It typically underestimates the uncertainty associated with missing values because the imputed values fall exactly on the fitted regression line and carry no residual variation.
Regressioimputation often serves as a baseline method in exploratory analyses or as a component within more