D-statistic

The D-statistic, also known as Patterson’s D-statistic or the ABBA-BABA test, is a statistical measure used in population genetics to test for admixture between populations. It analyzes genome-wide patterns of allele sharing across four taxa: two reference populations, a putative admixed population, and an outgroup. The method relies on biallelic sites and distinguishes ancestral (A) from derived (B) alleles to count occurrences of the ABBA and BABA site patterns.
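To make the site patterns concrete, here is a minimal Python sketch of how ABBA and BABA sites could be tallied. It rests on several assumptions that are not part of the description above: each site is a tuple of one haploid allele per taxon, the taxa are ordered (P1, P2, P3, outgroup), and the outgroup allele is treated as ancestral. The names `classify_site` and `count_patterns` are hypothetical and do not belong to any particular tool.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def classify_site(p1, p2, p3, outgroup):
    """Return 'ABBA', 'BABA', or None for a single site.

    Assumption: one haploid allele per taxon, and the outgroup allele
    is taken as ancestral (A); the other allele is derived (B).
    """
    alleles = (p1, p2, p3, outgroup)
    # Skip sites with missing or ambiguous calls.
    if any(a not in VALID_BASES for a in alleles):
        return None
    # Keep only biallelic sites.
    if len(set(alleles)) != 2:
        return None
    # ABBA: P1 ancestral, P2 and P3 share the derived allele.
    if p1 == outgroup and p2 == p3 and p2 != outgroup:
        return "ABBA"
    # BABA: P2 ancestral, P1 and P3 share the derived allele.
    if p2 == outgroup and p1 == p3 and p1 != outgroup:
        return "BABA"
    return None


def count_patterns(sites):
    """Count ABBA and BABA sites in an iterable of (p1, p2, p3, outgroup)."""
    counts = Counter(classify_site(*site) for site in sites)
    return counts["ABBA"], counts["BABA"]


if __name__ == "__main__":
    example_sites = [
        ("A", "G", "G", "A"),  # ABBA
        ("G", "A", "G", "A"),  # BABA
        ("A", "A", "G", "A"),  # derived only in P3: uninformative
        ("A", "C", "G", "A"),  # three alleles: not biallelic
    ]
    n_abba, n_baba = count_patterns(example_sites)
    print(n_abba, n_baba)
```

Real data usually involve diploid genotypes or population allele frequencies, so this per-site classification is only the simplest case.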

The statistic is computed from the counts of ABBA and BABA site patterns across the genome. Let nABBA denote the number of sites showing the ABBA pattern and nBABA denote the number of sites showing the BABA pattern. D is defined as (nABBA − nBABA) / (nABBA + nBABA). In allele-frequency implementations, the statistic can be weighted rather than simply counted. Under a simple, non-admixture tree, ABBA and BABA patterns occur with equal frequency due to incomplete lineage sorting, yielding D near zero. A significant deviation from zero suggests gene flow between the test population and one of the other populations in the quartet.

Significance is typically assessed using block jackknife or bootstrap approaches over large genomic blocks to estimate standard errors and Z-scores. A |Z| value above common thresholds (for example, around 3) indicates statistical support for admixture.

Applications of the D-statistic are widespread in population genetics and evolutionary biology. It has been used to detect Neanderthal and Denisovan gene flow into modern humans and to explore introgression in various species. Limitations include sensitivity to outgroup choice, incorrect tree topology, data quality, and ancestral population structure. The statistic also does not quantify the proportion of admixture, only its presence or absence.
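
The definition of D and the block jackknife test described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of any specific software: the genome is assumed to be already split into blocks of roughly equal size, each block contributes a pair of ABBA and BABA counts, and a simple unweighted delete-one jackknife is used (a weighted variant is more appropriate when blocks differ in size). The function names `d_statistic` and `block_jackknife` are hypothetical.

```python
import math


def d_statistic(n_abba, n_baba):
    """D = (nABBA - nBABA) / (nABBA + nBABA)."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites to compare")
    return (n_abba - n_baba) / total


def block_jackknife(abba_blocks, baba_blocks):
    """Return (D, standard error, Z) from per-block pattern counts.

    Assumption: blocks are of comparable size, so an unweighted
    delete-one jackknife is used.
    """
    n = len(abba_blocks)
    if n != len(baba_blocks) or n < 2:
        raise ValueError("need at least two blocks of paired counts")

    total_abba = sum(abba_blocks)
    total_baba = sum(baba_blocks)
    d_full = d_statistic(total_abba, total_baba)

    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic(total_abba - a, total_baba - b)
        for a, b in zip(abba_blocks, baba_blocks)
    ]
    mean_loo = sum(leave_one_out) / n

    # Jackknife variance of the estimate.
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    std_err = math.sqrt(variance)

    # Z is undefined if every leave-one-out estimate is identical.
    z_score = d_full / std_err if std_err > 0 else math.nan
    return d_full, std_err, z_score


if __name__ == "__main__":
    # Made-up counts for five blocks, purely for illustration.
    abba = [12, 15, 9, 14, 11]
    baba = [8, 7, 10, 6, 9]
    d, se, z = block_jackknife(abba, baba)
    print(f"D = {d:.4f}, SE = {se:.4f}, Z = {z:.2f}")
```

In practice the number of blocks is much larger than in this toy input, and |Z| would be compared against a threshold such as the value of about 3 mentioned above.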