Bagging

Bagging, short for bootstrap aggregating, is an ensemble learning technique designed to improve predictive accuracy and stability. It builds multiple versions of a predictor by creating bootstrap samples—samples drawn with replacement—from the training data. A base learning algorithm is trained on each bootstrap sample, and the resulting models are combined to form a single aggregated predictor. Bagging was introduced by Leo Breiman in 1994.
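As a concrete illustration, the training step can be sketched in a few lines of Python; the snippet below assumes NumPy arrays and scikit-learn decision trees as the base learner, and all names and parameter values are illustrative rather than prescriptive.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def fit_bagged_trees(X, y, n_estimators=25, random_state=0):
        # Train one decision tree per bootstrap sample drawn with replacement.
        rng = np.random.default_rng(random_state)
        n_samples = X.shape[0]
        models = []
        for _ in range(n_estimators):
            idx = rng.integers(0, n_samples, size=n_samples)  # bootstrap indices
            models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        return models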

Classification tasks use majority voting to determine the final class; regression uses averaging of the predicted values. Because each model is trained on a different sample, their errors tend to cancel out, reducing overall variance. Bagging is particularly effective with high-variance, low-bias base learners, such as decision trees.
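A short sketch of the aggregation step, continuing the example above and assuming integer class labels (for regression, a simple mean of the predictions replaces the vote):

    import numpy as np

    def predict_vote(models, X):
        # Majority vote: most frequent predicted label per instance.
        preds = np.stack([m.predict(X) for m in models])  # shape (n_models, n_samples)
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)

    def predict_average(models, X):
        # Regression variant: average the predicted values.
        return np.mean([m.predict(X) for m in models], axis=0)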

The method also enables out-of-bag error estimation: on average, about 1/e ≈ 0.37 of the training instances are not included in a given bootstrap sample, and predictions on those held-out instances can be used to estimate the model's generalization error without separate validation data.
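For illustration, scikit-learn's BaggingClassifier exposes this directly through its oob_score option; the sketch below assumes training data X_train and y_train are already defined, and the hyperparameters are placeholders.

    from sklearn.ensemble import BaggingClassifier

    # The default base estimator is a decision tree; oob_score=True asks the
    # ensemble to score each training instance using only the models whose
    # bootstrap samples did not contain it.
    bag = BaggingClassifier(n_estimators=100, oob_score=True, random_state=0)
    bag.fit(X_train, y_train)
    print(bag.oob_score_)  # out-of-bag accuracy, no separate validation set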

A well-known extension is the random forest, which combines bagging with random feature selection when building trees. Bagging can be applied to any suitable base learner, but the gains depend on the base learner's bias-variance characteristics. Computational cost increases with the number of base learners.

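A brief sketch of the random forest variant using scikit-learn's RandomForestClassifier, where max_features controls the random feature selection at each split; as above, X_train and y_train are assumed to exist and the settings are illustrative.

    from sklearn.ensemble import RandomForestClassifier

    rf = RandomForestClassifier(
        n_estimators=200,      # more trees reduce variance but cost more compute
        max_features="sqrt",   # random subset of features considered at each split
        oob_score=True,        # out-of-bag estimate, as with plain bagging
        random_state=0,
    )
    rf.fit(X_train, y_train)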