correlations

Correlation is a statistical measure that describes how two variables relate to each other. It quantifies both the strength and the direction of an association, which may be approximately linear or merely monotonic depending on the coefficient used. It is symmetric: the correlation between X and Y is the same as between Y and X. Correlation does not by itself imply that one variable causes changes in the other.
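The properties above (sign as direction, magnitude as strength, and symmetry) can be illustrated with a minimal sketch of Pearson's r, one widely used coefficient. The function name `pearson_r` and the small data lists are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each observation from its own mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, invented for this example.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]   # tends to rise with hours
errors = [9, 8, 6, 5, 2]       # tends to fall with hours

r_pos = pearson_r(hours, score)    # positive direction
r_neg = pearson_r(hours, errors)   # negative direction
r_swapped = pearson_r(score, hours)  # same value as r_pos (symmetry)
```

Swapping the arguments leaves the result unchanged, which is the symmetry described above; neither value says anything about which variable, if either, drives the other.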

Several coefficients are used. Pearson r assesses linear relationships for interval or ratio data and ranges from -1 to 1, with values near 1 or -1 indicating a strong straight-line relationship and values near 0 indicating little linear association. Spearman rho and Kendall tau are nonparametric measures based on ranks and are suitable for ordinal data or data that are not normally distributed; both also range from -1 to 1. Interpretation is the same for all three: the sign gives the direction (positive or negative) and the magnitude gives the strength, although what counts as a strong correlation depends on context.

Correlation is often summarized in a correlation matrix for many variables and visualized with scatter plots. Partial correlation controls for, that is, removes the effect of, one or more confounding variables. In time series analysis, autocorrelation measures how a variable relates to its own values at a later time.

Limitations include sensitivity to outliers, restricted data ranges, and nonlinearity that can mask relationships. Spurious correlations can arise from coincidences or confounding factors. Causality requires additional evidence from experiments or causal modeling rather than correlation alone.

Common uses span science, engineering, economics, and finance, including feature selection, risk assessment, and pattern discovery. Researchers report the chosen coefficient, sample size, and confidence intervals to convey precision and reliability.