Home

meancentered

Mean-centered, or mean-centering, is a data preprocessing technique in statistics and data analysis. It involves subtracting the mean value of a data feature from each observation so that the feature’s mean becomes zero. This can be done per feature (column) in a data matrix, though centering can also be applied across rows in other contexts.

The typical procedure for a data matrix X with n observations and p variables is to compute

Mean-centering differs from standardization, which scales features to have unit variance in addition to zero mean.

the
mean
of
each
variable
across
observations,
then
subtract
these
means
from
the
corresponding
entries.
The
resulting
matrix
has
column
means
of
zero.
Mean-centering
is
often
a
preparatory
step
for
multivariate
methods
such
as
principal
component
analysis
(PCA),
factor
analysis,
and
canonical
correlation,
where
centering
ensures
that
analyses
reflect
variation
around
the
mean
rather
than
absolute
values.
It
is
also
used
in
linear
regression
to
simplify
interpretation
and,
in
some
cases,
to
reduce
multicollinearity
when
all
variables
are
included
without
an
intercept.
It
can
also
be
applied
to
whole
data
sets
or
to
specific
subsets,
and
in
some
contexts,
centering
is
done
across
observations
rather
than
features.
Practical
considerations
include
handling
missing
values,
which
must
be
addressed
before
centering,
and
recognizing
that
centering
can
alter
interpretability
of
intercepts
and
predictions
in
certain
models.
Double
centering
and
other
variants
may
be
used
in
specialized
matrix
normalization
tasks.