hotdeck

Hotdeck is a data imputation technique used to fill in missing values in a dataset by borrowing observed values from similar records, called donors. It is a non-parametric approach that does not rely on fitting a statistical model to the data; instead, it uses the observed data within a defined donor pool to replace missing entries.

The imputation process typically involves three steps. First, the data are partitioned into imputation classes or decks based on variables related to the variable being imputed or the missingness mechanism. Second, for each record with a missing value, a donor is selected from the corresponding deck, either at random or using a distance-based criterion like nearest neighbors. Third, the donor’s value for the target variable is copied into the missing slot. In some applications, multiple donors may be used to create several imputed datasets, enabling analysis that accounts for imputation uncertainty.

Variants of hotdeck differ in how donors are chosen and how decks are formed. Classical hotdeck uses fixed imputation classes, while nearest-neighbor hotdeck selects donors based on similarity across multiple variables. Hotdeck can impute one variable at a time or iteratively across several variables. It is suitable for both numeric and categorical data, with imputed values chosen from the corresponding donor’s value.

Advantages include preservation of the original data distribution and simplicity of implementation. Limitations include potential bias if the donor pool is not representative or if the matching criteria do not adequately capture relationships among variables. Multiple-imputation extensions help mitigate uncertainty and improve inferential validity. Hotdeck is commonly used in survey data, census processing, and other contexts where maintaining observed data integrity is important.
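
The three-step process described above (form decks, select a donor, copy the donor's value) can be illustrated with a minimal Python sketch of random within-class hotdeck. It is only one possible reading of the technique, not a reference implementation: the function name `hotdeck_impute`, its parameters, the use of `None` to mark missing values, and the small `survey` dataset are all hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def hotdeck_impute(records, target, class_vars, seed=None):
    """Random within-class hotdeck (illustrative sketch).

    Returns a copy of `records` in which missing `target` values (None)
    are filled with the value of a randomly chosen donor from the same deck.
    """
    rng = random.Random(seed)

    # Step 1: partition donors (records with an observed target) into decks
    # keyed by the values of the class variables.
    decks = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            key = tuple(rec[v] for v in class_vars)
            decks[key].append(rec[target])

    imputed = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input data are not modified
        if new_rec[target] is None:
            donors = decks.get(tuple(rec[v] for v in class_vars))
            if donors:
                # Steps 2 and 3: pick a donor at random and copy its value.
                new_rec[target] = rng.choice(donors)
            # If the deck has no donors, the value is left missing.
        imputed.append(new_rec)
    return imputed


# Hypothetical survey data; None marks a missing income.
survey = [
    {"region": "north", "sex": "f", "income": 52000},
    {"region": "north", "sex": "f", "income": 48000},
    {"region": "north", "sex": "f", "income": None},
    {"region": "south", "sex": "m", "income": 31000},
    {"region": "south", "sex": "m", "income": None},
    {"region": "west", "sex": "f", "income": None},  # empty deck: no donor
]

completed = hotdeck_impute(survey, "income", ["region", "sex"], seed=0)

# Repeating the random draw with different seeds gives several imputed datasets.
multiple = [
    hotdeck_impute(survey, "income", ["region", "sex"], seed=s) for s in range(5)
]
```

In this sketch a record whose deck contains no donors is left missing. In practice one might instead collapse classes or fall back to a nearest-neighbor criterion, which the sketch does not attempt.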