overdispersed

Overdispersed describes a situation in which the observed variability in a dataset exceeds what a chosen statistical model expects, most commonly in Poisson models for count data. Under a Poisson distribution, the mean and variance are equal, so when Var(Y) > E(Y), the data are said to be overdispersed. In binomial settings, dispersion is assessed relative to the binomial variance, np(1−p) for a count out of n trials (p(1−p) per trial), and the data are overdispersed when the observed variance exceeds this value.
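The mean-variance comparison can be illustrated with a small simulation. This is a minimal sketch using only the Python standard library. The sample size, the rate, the gamma shape, and the helper names `sample_poisson` and `variance_to_mean_ratio` are all illustrative assumptions, not part of the definition above. Gamma-distributed rates are just one convenient way to generate extra variability.

```python
import math
import random
from statistics import mean, variance


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) count with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def variance_to_mean_ratio(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like counts."""
    return variance(counts) / mean(counts)


rng = random.Random(0)         # fixed seed so the sketch is repeatable
n, mu, shape = 5000, 3.0, 2.0  # illustrative values, assumed for this example

# Every unit shares the same rate: the variance should be close to the mean.
pure_poisson = [sample_poisson(mu, rng) for _ in range(n)]

# Each unit gets its own rate, drawn from a gamma distribution with mean mu.
# The spread in the rates pushes the variance of the counts above their mean.
mixed_poisson = [
    sample_poisson(rng.gammavariate(shape, mu / shape), rng) for _ in range(n)
]

ratio_poisson = variance_to_mean_ratio(pure_poisson)
ratio_mixed = variance_to_mean_ratio(mixed_poisson)
print(f"Poisson variance/mean:       {ratio_poisson:.2f}")
print(f"gamma-Poisson variance/mean: {ratio_mixed:.2f}")
```

With these settings the first ratio should land near 1. The second should land near the theoretical value 1 + mu/shape = 2.5, which is the variance-to-mean ratio of the corresponding negative binomial distribution.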

Common causes include unobserved heterogeneity among units, clustering or dependence among observations, time-related effects, measurement error, and zero-inflation (excess zeros). Model misspecification or omitted covariates can also produce apparent overdispersion. Different mechanisms may operate in different datasets, sometimes simultaneously.

Overdispersion affects statistical inference by making standard errors too small and hypothesis tests overly optimistic when Poisson or binomial assumptions are used. Remedies include using models that incorporate a dispersion parameter, such as quasi-Poisson or negative binomial models for count data, which allow variance to exceed the mean. Alternative approaches include robust (sandwich) standard errors, zero-inflated or hurdle models for excess zeros, and mixed-effects models to account for clustering.

Diagnostics for overdispersion involve assessing the dispersion statistic, often the ratio of Pearson chi-square to degrees of freedom, and the deviance/df. Other tools include residual analysis and formal tests (for example, tests attributed to Cameron and Trivedi). Model comparison using information criteria (AIC/BIC) and checking whether alternative specifications reduce dispersion help determine appropriate modeling choices.
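The dispersion statistics for a Poisson fit can be computed by hand, as in the sketch below. It assumes an intercept-only model, so every fitted value equals the sample mean and one parameter is estimated. The counts are hypothetical, and `poisson_dispersion` is an illustrative helper, not a library function.

```python
import math


def poisson_dispersion(counts, fitted, n_params):
    """Return (Pearson chi-square / df, deviance / df) for a Poisson fit."""
    df = len(counts) - n_params  # residual degrees of freedom
    pearson = sum((y - m) ** 2 / m for y, m in zip(counts, fitted))
    # The y * log(y / m) term is taken as 0 when y == 0.
    deviance = 2 * sum(
        (y * math.log(y / m) if y > 0 else 0.0) - (y - m)
        for y, m in zip(counts, fitted)
    )
    return pearson / df, deviance / df


counts = [0, 0, 1, 0, 7, 2, 0, 12, 1, 0]  # hypothetical counts for illustration
mu_hat = sum(counts) / len(counts)        # intercept-only Poisson estimate of the mean
fitted = [mu_hat] * len(counts)

pearson_ratio, deviance_ratio = poisson_dispersion(counts, fitted, n_params=1)
print(f"Pearson chi-square / df: {pearson_ratio:.2f}")
print(f"deviance / df:           {deviance_ratio:.2f}")
```

Ratios near 1 are consistent with the Poisson assumption, and values well above 1 point toward overdispersion. In a quasi-Poisson analysis the Pearson ratio serves as the dispersion estimate, and the Poisson standard errors are multiplied by its square root.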