mixup

Mixup is a data augmentation technique used in supervised machine learning to improve generalization by creating synthetic training examples through linear interpolation of pairs of original samples and their labels. It was introduced to encourage models to behave linearly between training data points, thereby reducing overfitting and memorization of exact training instances.

In the standard formulation, two training examples (x_i, y_i) and (x_j, y_j) are selected at random, and a mixing coefficient λ is drawn from a Beta distribution, typically Beta(α, α). The mixed input is x̃ = λ x_i + (1 − λ) x_j, and the corresponding mixed label is ỹ = λ y_i + (1 − λ) y_j. When labels are one-hot vectors, ỹ becomes a soft label representing a convex combination of the original classes. The parameter α controls the strength of interpolation: small values keep samples close to the originals, while larger values produce more blended inputs and labels.
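
As a concrete illustration, here is a minimal sketch of these equations in PyTorch (the framework is our assumption; the function name mixup_batch and the batched pairing via a random permutation are illustrative choices rather than a reference implementation):

```python
import numpy as np
import torch

def mixup_batch(x, y, alpha=0.2):
    """Return (x~, y~): a convex combination of the batch with a
    randomly shuffled copy of itself.

    x: inputs of shape (batch, ...); y: one-hot labels of shape
    (batch, num_classes); alpha: Beta distribution parameter.
    """
    # lambda ~ Beta(alpha, alpha); alpha <= 0 disables mixing.
    lam = np.random.beta(alpha, alpha) if alpha > 0 else 1.0
    # Pair each example with a random partner from the same batch.
    index = torch.randperm(x.size(0))
    # Convex combination of inputs and of labels.
    x_mixed = lam * x + (1 - lam) * x[index]
    y_mixed = lam * y + (1 - lam) * y[index]
    return x_mixed, y_mixed

# Example: mix a batch of 8 CIFAR-sized images over 10 classes.
x = torch.randn(8, 3, 32, 32)
y = torch.nn.functional.one_hot(torch.randint(0, 10, (8,)), num_classes=10).float()
x_mixed, y_mixed = mixup_batch(x, y, alpha=0.2)
```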

Variants and related methods include CutMix, which replaces a region of one image with a region from another while adjusting the labels accordingly, and Manifold Mixup, which applies interpolation in hidden feature representations rather than the input space. Mixup has been extended to domains beyond images, such as audio and text at the embedding level, and it often complements other regularization techniques.
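
For contrast with input-space interpolation, a rough CutMix sketch under the same assumed setting (image tensors of shape (batch, channels, height, width), one-hot labels, and the illustrative name cutmix_batch) shows how the region swap and the area-based label adjustment fit together:

```python
import numpy as np
import torch

def cutmix_batch(x, y, alpha=1.0):
    """Paste a random rectangle from a shuffled partner image into
    each image, and mix the one-hot labels by the pasted area."""
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(x.size(0))
    _, _, h, w = x.shape
    # Rectangle whose area fraction is roughly (1 - lam),
    # centered at a uniformly random point.
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    x_mixed = x.clone()
    x_mixed[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
    # Recompute lambda from the exact pasted area before mixing labels.
    lam = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    y_mixed = lam * y + (1 - lam) * y[index]
    return x_mixed, y_mixed
```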

Applications and impact: mixup is widely used to improve robustness, calibration, and accuracy across diverse tasks.

Potential drawbacks include diminished performance on tasks that require precise decision boundaries, and degraded results when the α parameter is poorly tuned. Overall, mixup offers a simple, effective regularization approach that can be integrated into standard training pipelines.
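
As a sketch of that integration, again assuming PyTorch and reusing the hypothetical mixup_batch from above, mixup drops into an ordinary training loop as a one-line transformation of each batch (note that nn.CrossEntropyLoss accepts soft probability targets only in relatively recent PyTorch releases):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in model and synthetic data so the sketch runs end to end.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()  # accepts soft targets in recent PyTorch

x = torch.randn(8, 3, 32, 32)
y = F.one_hot(torch.randint(0, 10, (8,)), num_classes=10).float()

for step in range(10):
    # mixup_batch is the sketch defined earlier in this article.
    x_mixed, y_mixed = mixup_batch(x, y, alpha=0.2)
    optimizer.zero_grad()
    loss = criterion(model(x_mixed), y_mixed)
    loss.backward()
    optimizer.step()
```

A common equivalent variant keeps the hard labels of both mixing partners and mixes the loss instead, computing λ · loss(pred, y_i) + (1 − λ) · loss(pred, y_j); this sidesteps the need for soft-target support in the loss function.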