Mehrfachimputation

Mehrfachimputation, or multiple imputation, is a statistical method for handling missing data by creating several plausible complete datasets, analyzing each one separately, and then combining the results to account for uncertainty due to missingness. The central idea is to replace every missing value with a set of plausible values drawn from a distribution conditional on the observed data, producing m complete datasets.
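As an illustration of this idea, the following is a minimal sketch, not a reference implementation. It assumes a single incomplete variable `y`, one fully observed predictor `x`, and a linear regression as the imputation model. The function name `impute_m_times` and the bootstrap step (used here as one simple way to let each imputation reflect uncertainty in the fitted model) are assumptions made for this example.

```python
import numpy as np

def impute_m_times(x, y, m=5, seed=0):
    """Return m completed copies of y (hypothetical helper).

    Missing entries of y (NaN) are replaced by draws from a linear
    regression of y on x that is refitted on a bootstrap sample of the
    observed cases for every imputation.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    obs_idx = np.flatnonzero(~missing)

    completed = []
    for _ in range(m):
        # Resample the observed cases so each imputation uses slightly
        # different regression parameters.
        boot = rng.choice(obs_idx, size=obs_idx.size, replace=True)
        slope, intercept = np.polyfit(x[boot], y[boot], deg=1)
        resid = y[boot] - (intercept + slope * x[boot])
        sigma = resid.std(ddof=2)  # residual spread of the fitted line

        # Fill the gaps with the prediction plus random residual noise.
        y_imp = y.copy()
        noise = rng.normal(0.0, sigma, size=int(missing.sum()))
        y_imp[missing] = intercept + slope * x[missing] + noise
        completed.append(y_imp)
    return completed

# Small synthetic demonstration (made-up data, for illustration only).
demo_rng = np.random.default_rng(42)
x_demo = np.linspace(0.0, 10.0, 40)
y_demo = 2.0 + 3.0 * x_demo + demo_rng.normal(0.0, 1.0, size=x_demo.size)
y_demo[::4] = np.nan  # remove every fourth value
datasets = impute_m_times(x_demo, y_demo, m=5, seed=1)
```

Each element of `datasets` keeps the observed values unchanged and differs from the others only in the imputed entries. That variation across the m copies is what carries the uncertainty about the missing values.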

The typical workflow consists of four steps. First, specify an imputation model that captures the relations among variables and respects the data structure. Second, generate m imputations to create m complete datasets. Third, analyze each dataset with the intended statistical method. Fourth, pool the results across imputations using Rubin’s rules to obtain a single estimate and an associated standard error that reflect both within-imputation and between-imputation variability.

Imputation models can be parametric, such as joint modeling with a multivariate distribution, or semi-parametric, as in fully conditional specification (chained equations) or predictive mean matching. The method is most appropriate under missing at random (MAR) assumptions, meaning the probability of missingness may depend on observed data but not on unobserved values. Including relevant auxiliary variables in the imputation model helps satisfy MAR and reduce bias. Results are interpreted as incorporating uncertainty about missing values, with pooling yielding valid inferences under the imputation model.

Limitations include dependence on correct model specification, computational demands, and potential bias if data are missing not at random (MNAR) without sensitivity analyses. Mehrfachimputation has become a standard tool in many fields, enabling more efficient and less biased use of incomplete data.
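
The pooling step of the workflow can be sketched in a few lines. This is a minimal example under stated assumptions: the analysis yields one scalar estimate and its squared standard error per imputed dataset, and the function name `pool_rubin` is hypothetical. The total variance combines the average within-imputation variance with the between-imputation variance, inflated by the factor (1 + 1/m).

```python
import math

def pool_rubin(estimates, variances):
    """Pool m scalar estimates and their variances with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = sum(estimates) / m                 # pooled point estimate
    within = sum(variances) / m                # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between  # total variance
    return q_bar, math.sqrt(total)

# Illustrative numbers only (not taken from any real analysis).
pooled_est, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In this toy input the three estimates disagree noticeably, so the between-imputation term dominates and the pooled standard error is larger than any single-dataset standard error.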