pointbiserial

The point-biserial correlation coefficient, typically denoted r_pb, is a measure of the strength and direction of the association between a dichotomous (binary) variable and a continuous variable. It is a special case of the Pearson correlation when one variable is binary coded as 0 and 1. It is commonly used in psychology, education, medicine, and social sciences to assess how a binary grouping relates to a quantitative outcome.

Calculation is based on the means of the continuous variable in the two groups defined by the binary variable. Let X be the binary variable with values 0 and 1, having proportions p and q = 1 − p, respectively. Let Y be the continuous variable with overall standard deviation s (computed over all observations with divisor n), M0 be the mean of Y when X = 0, and M1 be the mean of Y when X = 1. The point-biserial correlation is r_pb = (M1 − M0) * sqrt(p q) / s. Equivalently, r_pb is the Pearson correlation between X and Y when X is coded as 0 and 1.

Interpretation focuses on direction and strength. The value of r_pb ranges from -1 to 1. A positive r_pb indicates that higher Y values are associated with the group coded as 1; a negative r_pb indicates that higher Y values are associated with the group coded as 0. The magnitude reflects the strength of the association, with guidelines similar to Pearson’s r, though context matters.

Assumptions and cautions include that Y should be measured on at least an interval scale, observations should be independent, and unequal group sizes can affect stability. Like all correlations, r_pb does not imply causation and captures linear association; non-linear relationships may not be well represented by r_pb.
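
As a minimal sketch of the calculation, the Python below computes r_pb from the group-means formula and compares it with a hand-written Pearson correlation on the same data. The function names (point_biserial, pearson) and the sample values are hypothetical, chosen only for illustration. The sketch assumes s is the population standard deviation of Y (divisor n), which is the convention under which the formula matches Pearson's r exactly.

```python
import math
from statistics import fmean, pstdev


def point_biserial(x, y):
    """Point-biserial correlation via r_pb = (M1 - M0) * sqrt(p * q) / s.

    x: sequence of 0/1 group codes; y: sequence of numeric values.
    Assumption: s is the population standard deviation of y (divisor n).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    y0 = [yi for xi, yi in zip(x, y) if xi == 0]
    y1 = [yi for xi, yi in zip(x, y) if xi == 1]
    if len(y0) + len(y1) != len(x) or not y0 or not y1:
        raise ValueError("x must contain only 0 and 1, with both groups present")
    p = len(y0) / len(y)   # proportion of observations coded 0
    q = 1 - p              # proportion of observations coded 1
    s = pstdev(y)          # overall standard deviation of y, divisor n
    if s == 0:
        raise ValueError("y is constant, so the correlation is undefined")
    return (fmean(y1) - fmean(y0)) * math.sqrt(p * q) / s


def pearson(x, y):
    """Plain Pearson correlation, used here only as a cross-check."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data: group code and a continuous score.
group = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
score = [52.0, 48.0, 55.0, 50.0, 61.0, 58.0, 66.0, 59.0, 63.0, 60.0]

r_formula = point_biserial(group, score)
r_pearson = pearson(group, score)
print(f"formula: {r_formula:.4f}  pearson: {r_pearson:.4f}")
```

The two values should agree up to floating-point rounding. Swapping the 0/1 coding flips the sign of r_pb but leaves its magnitude unchanged, which is why the direction of the coefficient has to be read against the coding.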