Surprisal

Surprisal, also known as self-information, is a measure of how unexpected a particular outcome is. For an event with probability p, the surprisal is I(p) = -log_b p, where log_b denotes the logarithm with base b. The unit is bits if base 2 is used, nats for base e, and so on. Surprisal is nonnegative and becomes larger as p decreases; it is zero when p = 1 and grows without bound as p approaches zero.
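As a rough illustration of the definition above, the following Python sketch computes surprisal for a given probability and base. The function name `surprisal` and its `base` parameter are choices made for this example and do not come from any particular library.

```python
import math


def surprisal(p: float, base: float = 2.0) -> float:
    """Return the surprisal -log_b(p) of an event with probability p.

    Illustrative helper only; p must lie in (0, 1].
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    # Computed as log_b(1/p), which equals -log_b(p) and gives +0.0 at p = 1.
    return math.log(1.0 / p, base)


if __name__ == "__main__":
    print(surprisal(0.5))               # fair coin flip, about 1 bit
    print(surprisal(0.125))             # one of eight equally likely outcomes, about 3 bits
    print(surprisal(1.0))               # certain event, zero surprisal
    print(surprisal(0.5, base=math.e))  # the same coin flip measured in nats
    print(surprisal(1e-9))              # rare event, large surprisal
```

Changing `base` only rescales the result by a constant factor, which is why bits and nats describe the same quantity in different units.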

For a random variable X with distribution p(x), the average surprisal, called entropy, is H(X) = E[I(X)] = -sum_x p(x) log_b p(x). Entropy is the expected information content of X. The surprisal of independent events is additive: because p(x, y) = p(x) p(y) for independent events, the surprisal of a joint outcome equals the sum of the individual surprisals, I(x, y) = I(x) + I(y). In coding terms, the average code length of a uniquely decodable code cannot be less than the entropy.

In conditional form, the surprisal of an event given a context is I(x|context) = -log_b p(x|context). This conditional surprisal underpins language modeling and cognitive studies of predictability: more surprising words (lower conditional probability) have higher surprisal and are often associated with longer processing times in reading experiments.

Applications include data compression, coding theory, and psycholinguistics. Surprisal is widely used to quantify how expected or unexpected a piece of information is within a given probabilistic model.
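The statements above about entropy, additivity, the coding bound, and conditional surprisal can be checked numerically. The sketch below is only an illustration: the four-symbol distribution, the prefix code, and the next-word probabilities are all made up for the example, and the helper names are hypothetical.

```python
import math


def surprisal(p, base=2.0):
    """Surprisal -log_b(p), computed as log_b(1/p); assumes 0 < p <= 1."""
    return math.log(1.0 / p, base)


def entropy(dist, base=2.0):
    """Expected surprisal of a distribution given as {outcome: probability}."""
    # Zero-probability outcomes are skipped: p * log(1/p) tends to 0 as p -> 0.
    return sum(p * surprisal(p, base) for p in dist.values() if p > 0.0)


# Hypothetical source distribution over four symbols.
dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
h = entropy(dist)

# Additivity: for independent events the joint probability is the product,
# so the joint surprisal is the sum of the individual surprisals.
p_x, p_y = 0.5, 0.25
joint = surprisal(p_x * p_y)
separate = surprisal(p_x) + surprisal(p_y)

# Coding bound: an assumed prefix code a->0, b->10, c->110, d->111.
# Its average length cannot fall below the entropy; for this distribution
# the two coincide because every probability is a power of 1/2.
code_lengths = {"a": 1, "b": 2, "c": 3, "d": 3}
avg_len = sum(dist[s] * code_lengths[s] for s in dist)

# Conditional surprisal: invented next-word probabilities for one context.
next_word = {"mat": 0.6, "sofa": 0.3, "moon": 0.1}
word_surprisal = {w: surprisal(p) for w, p in next_word.items()}

if __name__ == "__main__":
    print("entropy (bits):", h)
    print("joint vs. summed surprisal:", joint, separate)
    print("average code length (bits):", avg_len)
    for w, s in word_surprisal.items():
        print(f"I({w} | context) = {s:.3f} bits")
```

In this toy setting the least probable continuation, "moon", receives the highest conditional surprisal, matching the reading that less predictable words carry more information.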