ZINB

ZINB refers to the zero-inflated negative binomial distribution, a two-part mixture model designed for count data that show both overdispersion and excess zeros. It combines a binary process that yields additional zeros with a negative binomial process that governs the remaining counts, which can themselves be zero.

In formal terms, let Y be a nonnegative integer. With probability pi, the observation is an extra zero. With probability 1 - pi, the observation follows a negative binomial distribution. Thus the probability mass function is:

- P(Y = 0) = pi + (1 - pi) NB(0; r, p)

- P(Y = y) = (1 - pi) NB(y; r, p) for y = 1, 2, ...

Here NB(y; r, p) is the negative binomial probability, often parameterized by shape r and success probability p, given by NB(y; r, p) = Gamma(y + r) / (Gamma(r) y!) * p^r * (1 - p)^y. An alternative is to use mean mu and dispersion k, with NB(y; mu, k) reflecting the same distribution in a different parameterization.

Estimation and use

Parameters pi, r, and p (or pi, mu, and k) are typically estimated by maximum likelihood or Bayesian methods. Zero-inflation (pi) can be modeled separately from the mean of the NB component, allowing covariates to influence both the probability of extra zeros and the expected count. This leads to ZINB regression models, useful when covariates affect two separate data-generating processes.

Applications and context

ZINB models are commonly applied to ecological and epidemiological data, genomics (notably single-cell sequencing), and insurance claims, where there are many zeros and overdispersed counts. They are often contrasted with hurdle models; model choice depends on whether zeros are generated by a separate process or by the same mechanism governing counts. Potential challenges include identifiability issues and sensitivity to model mis-specification.
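
Example

The probability mass function defined above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not a reference implementation: the function names nb_pmf, zinb_pmf, and mu_k_to_r_p are hypothetical, and the conversion from (mu, k) assumes that k plays the role of the shape r, so that p = k / (k + mu). Other sources define the dispersion as 1/k, in which case the conversion would differ.

```python
import math


def nb_pmf(y: int, r: float, p: float) -> float:
    """NB(y; r, p) = Gamma(y + r) / (Gamma(r) y!) * p^r * (1 - p)^y."""
    if r <= 0 or not (0.0 < p < 1.0):
        raise ValueError("require r > 0 and 0 < p < 1")
    if y < 0:
        return 0.0
    # Work on the log scale so large y or r does not overflow Gamma().
    log_pmf = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log1p(-p)
    )
    return math.exp(log_pmf)


def zinb_pmf(y: int, pi: float, r: float, p: float) -> float:
    """P(Y = y) for the zero-inflated negative binomial."""
    if not (0.0 <= pi <= 1.0):
        raise ValueError("require 0 <= pi <= 1")
    base = (1.0 - pi) * nb_pmf(y, r, p)
    # Extra zeros add mass pi at y = 0 only.
    return pi + base if y == 0 else base


def mu_k_to_r_p(mu: float, k: float) -> tuple[float, float]:
    """Convert (mu, k) to (r, p), ASSUMING k is the NB shape (r = k)."""
    return k, k / (k + mu)


if __name__ == "__main__":
    pi = 0.3
    r, p = mu_k_to_r_p(mu=3.0, k=2.0)
    probs = [zinb_pmf(y, pi, r, p) for y in range(500)]
    print("P(Y = 0):", probs[0])
    print("total mass over 0..499:", sum(probs))
    print("mean:", sum(y * q for y, q in enumerate(probs)))
```

Under this parameterization the NB component has mean mu = r (1 - p) / p, so the ZINB mean is (1 - pi) mu. Comparing the printed mean against that value is a quick sanity check on the chosen parameters.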