Double-centering

Double-centering is a matrix operation that transforms a square matrix into a centered form by subtracting its row means and column means and then adding back the grand mean. Every row and every column of the resulting matrix sums to zero, which removes additive row and column effects and leaves only the interaction structure of the data. It is commonly applied to distance or similarity matrices and to data matrices in multivariate analysis.

Given an n-by-n matrix A, the double-centering procedure computes row means r_i = (1/n) sum_j a_ij, column means c_j = (1/n) sum_i a_ij, and the grand mean m = (1/n^2) sum_i sum_j a_ij. The double-centered entry is a*_ij = a_ij − r_i − c_j + m. If A is symmetric, the row and column means align so that a*_ij = a*_ji and the matrix is centered in both dimensions.

In classical multidimensional scaling (MDS), double-centering is applied to the squared distance matrix D^2 to obtain the Gram matrix B, via B = −1/2 J D^2 J, where J = I − (1/n) 11^T. The Gram matrix B contains inner products and can be decomposed spectrally to recover coordinates in Euclidean space. Double-centering thus links pairwise distances to a centered inner-product representation.

Properties and caveats include that double-centering enforces zero sums across rows and columns; if the original distances arise from an exact Euclidean embedding, the resulting matrix is positive semidefinite. If the distances are not Euclidean, B may have negative eigenvalues, signaling non-Euclidean structure in the data. The operation is also used in kernel methods to produce centered kernel matrices from raw similarity measures.
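
As an illustration of the formulas above, the following is a minimal NumPy sketch, not a reference implementation. The function names double_center and classical_mds and the parameter k are hypothetical, chosen only for this example. It assumes D holds plain (unsquared) pairwise distances and squares them entrywise, and it clips negative eigenvalues to zero before taking square roots, which is one simple way to handle non-Euclidean input.

```python
import numpy as np


def double_center(A):
    """Subtract row means and column means, then add back the grand mean."""
    A = np.asarray(A, dtype=float)
    row_means = A.mean(axis=1, keepdims=True)  # r_i, shape (n, 1)
    col_means = A.mean(axis=0, keepdims=True)  # c_j, shape (1, n)
    grand_mean = A.mean()                      # m
    return A - row_means - col_means + grand_mean


def classical_mds(D, k=2):
    """Sketch of classical MDS: coordinates from pairwise distances D.

    Returns the (n, k) coordinate array and all eigenvalues of B, so the
    caller can inspect them for negative values.
    """
    D = np.asarray(D, dtype=float)
    # B = -1/2 J D^2 J, with D^2 taken entrywise; J A J equals double_center(A).
    B = -0.5 * double_center(D ** 2)
    eigvals, eigvecs = np.linalg.eigh(B)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    top = np.clip(eigvals[order], 0.0, None)   # assumption: clip negatives to zero
    coords = eigvecs[:, order] * np.sqrt(top)
    return coords, eigvals
```

Calling double_center on any square array should give row and column sums that are zero up to floating-point error. For distances computed from points that really lie in k dimensions, the coordinates returned by classical_mds reproduce those distances up to rotation, reflection, and translation, and clearly negative entries in the returned eigenvalues indicate that the input distances are not Euclidean.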