Out-of-distribution

Out-of-distribution (OOD) refers to inputs that do not lie within the distribution of data used to train a machine learning model. In supervised learning, the training data are assumed to be representative of the situations the model will encounter at deployment; OOD inputs violate this assumption, potentially causing unreliable predictions, miscalibrated uncertainty estimates, and degraded performance. OOD phenomena arise from distributional shifts such as covariate shift, concept drift, domain shift, or the appearance of novel classes not seen during training.
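
As a minimal illustration of why this matters, the sketch below (synthetic data and placeholder values, using a scikit-learn logistic regression) shows a model assigning near-certain class probabilities to an input far outside its training distribution.

```python
# Synthetic illustration (all data and values are made up): a model trained on
# in-distribution data can report near-certain probabilities on an OOD input.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# In-distribution training data: two well-separated 2-D Gaussian classes.
X_train = np.vstack([rng.normal(-2.0, 1.0, size=(500, 2)),
                     rng.normal(2.0, 1.0, size=(500, 2))])
y_train = np.array([0] * 500 + [1] * 500)

clf = LogisticRegression().fit(X_train, y_train)

# A point far outside the training distribution, but deep on one side of the
# learned decision boundary: the model still reports near-certainty.
x_ood = np.array([[30.0, 30.0]])
print(clf.predict_proba(x_ood))  # approximately [[0., 1.]]
```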

Detection and mitigation strategies aim to identify OOD samples and reduce their impact. Common approaches include calibrating predicted probabilities (for example with temperature scaling), applying confidence thresholds or entropy-based rejection, and using distance- or density-based methods in feature space (such as k-nearest neighbors, Gaussian or flow-based density estimation). Open-set or novelty detection techniques, one-class classifiers, and ensemble methods are also employed. In some settings, models are trained to be robust to OOD data or to abstain from predicting when uncertainty is high, sometimes through specific training objectives or exposure to outliers.
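
As a rough sketch of two of the approaches above, the snippet below implements entropy-based rejection on softmax outputs and a k-nearest-neighbor distance score in feature space; the threshold, feature dimensions, and stand-in data are illustrative assumptions, not a specific published method.

```python
# Sketch of entropy-based rejection and a kNN distance score for OOD scoring.
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predictive_entropy(logits):
    p = softmax(logits)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def entropy_reject(logits, threshold):
    """Flag inputs whose predictive entropy exceeds the threshold as OOD."""
    return predictive_entropy(logits) > threshold

def knn_ood_score(train_features, test_features, k=5):
    """Distance to the k-th nearest in-distribution feature; larger = more OOD."""
    # Pairwise Euclidean distances (brute force; fine for a sketch).
    d = np.linalg.norm(test_features[:, None, :] - train_features[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k - 1]

# Example with random stand-in logits and features:
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))              # stand-in classifier logits
print(entropy_reject(logits, threshold=2.0))

feats_id = rng.normal(size=(100, 32))          # in-distribution features
feats_new = rng.normal(loc=3.0, size=(3, 32))  # shifted, OOD-like features
print(knn_ood_score(feats_id, feats_new, k=5))
```

In practice, distance- or density-based scores of this kind are typically computed on features from an intermediate or penultimate layer of the trained model rather than on raw inputs.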

Evaluation of OOD performance typically involves testing on in-distribution data alongside separate out-of-distribution datasets, using metrics such as AUROC, false-positive rate at a fixed true-positive rate, or detection error. Benchmarks and methods vary across domains, including computer vision, natural language processing, and speech processing.
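
The snippet below sketches this evaluation protocol under the common convention of treating OOD detection as a binary classification problem over detector scores; the scores are random stand-ins, and the 95% true-positive-rate operating point is one typical choice.

```python
# Sketch: evaluate an OOD detector with AUROC and FPR at 95% TPR.
# Labels: 1 = OOD, 0 = in-distribution; higher score = more OOD-like.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
scores_id = rng.normal(loc=0.0, size=1000)   # detector scores on in-distribution test data
scores_ood = rng.normal(loc=2.0, size=1000)  # detector scores on an OOD dataset

y_true = np.concatenate([np.zeros_like(scores_id), np.ones_like(scores_ood)])
y_score = np.concatenate([scores_id, scores_ood])

auroc = roc_auc_score(y_true, y_score)

fpr, tpr, _ = roc_curve(y_true, y_score)
fpr_at_95tpr = fpr[np.searchsorted(tpr, 0.95)]  # first operating point reaching 95% TPR

print(f"AUROC: {auroc:.3f}, FPR@95TPR: {fpr_at_95tpr:.3f}")
```

Conventions differ across benchmarks as to whether the OOD or the in-distribution set is treated as the positive class, so the direction of the scores and the reported rates should be stated explicitly.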

Challenges persist in defining what constitutes OOD in a given context, balancing detection with maintaining in-distribution accuracy, and reliably calibrating uncertainty in high-dimensional spaces. Robust OOD handling remains an active area of research with practical importance for safety and reliability.
