
minibatches

Minibatches are subsets of a training dataset used to update model parameters during an iteration of optimization. They lie between full-batch updates and single-sample updates, and are central to mini-batch gradient descent. By processing several examples at once, minibatches enable vectorized computation and more stable gradient estimates than stochastic updates, while avoiding the memory burden of a full batch.
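
To make the idea concrete, here is a minimal sketch of a mini-batch update loop for a least-squares problem in NumPy. The synthetic data, learning rate, and batch size are arbitrary choices for the example, not anything prescribed by the definition above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = X @ w_true + noise
X = rng.normal(size=(1000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=1000)

w = np.zeros(5)      # parameters to learn
batch_size = 32      # minibatch size
lr = 0.1             # learning rate

for epoch in range(20):
    # Shuffle once per epoch, then walk through the data in minibatches.
    perm = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Gradient of the mean squared error, estimated on this minibatch only.
        grad = 2.0 / len(idx) * Xb.T @ (Xb @ w - yb)
        w -= lr * grad

print("recovered weights:", np.round(w, 3))
```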

Typical minibatch sizes range from small (16–32) to larger (256–512), with common defaults such as 32, 64, or 128. The choice depends on data size, model complexity, and hardware constraints like GPU memory. Smaller batches introduce more gradient noise, which can help exploration and generalization in some cases, while larger batches reduce noise but require more memory and may slow updates or affect generalization in others.
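
The noise claim can be checked empirically. The sketch below, which reuses the same kind of synthetic least-squares setup as above (all sizes are illustrative assumptions), measures how the spread of minibatch gradient estimates shrinks as the batch size grows.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic least-squares problem, as in the previous sketch.
X = rng.normal(size=(10000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=10000)
w = np.zeros(5)  # evaluate gradient noise at an arbitrary parameter value

def minibatch_grad(batch_size):
    # Gradient of the mean squared error on one randomly drawn minibatch.
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    return 2.0 / batch_size * Xb.T @ (Xb @ w - yb)

for batch_size in (16, 64, 256, 1024):
    grads = np.stack([minibatch_grad(batch_size) for _ in range(500)])
    # Spread across repeated draws: a rough measure of gradient noise.
    print(f"batch_size={batch_size:5d}  grad std ~ {grads.std(axis=0).mean():.4f}")
```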

In practice, data are shuffled and divided into minibatches each epoch. In many frameworks you specify a batch_size parameter, and data loaders handle batching and may perform on-the-fly data augmentation. When memory is limited, gradient accumulation can simulate larger batches by summing gradients over several forward and backward passes before performing an update.
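
As one concrete illustration, the PyTorch-style sketch below shows both ideas: a DataLoader handles shuffling and batching via its batch_size argument, and gradient accumulation over accum_steps small batches approximates a single update with an effectively larger batch. The toy model, tensor shapes, and accum_steps value are assumptions made for the example, not part of any particular recipe.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset and model; shapes and sizes are arbitrary for illustration.
X = torch.randn(1000, 20)
y = torch.randn(1000, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Linear(20, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

accum_steps = 4  # 4 batches of 32 behave roughly like one batch of 128

for epoch in range(3):
    optimizer.zero_grad()
    for step, (xb, yb) in enumerate(loader, start=1):
        loss = loss_fn(model(xb), yb)
        # Scale so the accumulated gradient matches an average over the larger batch.
        (loss / accum_steps).backward()
        if step % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```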

Minibatches are widely used in deep learning and other machine learning settings. They enable efficient parallel computation on hardware accelerators and are integral to optimization routines such as stochastic and mini-batch gradient descent. Some techniques, like batch normalization, rely on statistics computed over the minibatch. Minibatches also feature in distributed and streaming contexts, where data are processed in micro-batches in parallel.
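
To make the batch-normalization point concrete, here is a minimal NumPy sketch of the per-feature statistics involved; the array shapes and epsilon value are illustrative assumptions. Because the mean and variance are taken over the minibatch axis, the normalized output depends on which examples happen to share a batch.

```python
import numpy as np

rng = np.random.default_rng(2)
minibatch = rng.normal(loc=3.0, scale=2.0, size=(64, 10))  # 64 examples, 10 features

# Core step of batch normalization: statistics over the minibatch dimension.
mean = minibatch.mean(axis=0)
var = minibatch.var(axis=0)
eps = 1e-5
normalized = (minibatch - mean) / np.sqrt(var + eps)

print("per-feature mean ~", np.round(normalized.mean(axis=0), 6))
print("per-feature std  ~", np.round(normalized.std(axis=0), 6))
```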

See also: stochastic gradient descent, batch gradient descent, batch normalization, data loader.
