Home

CountMin

Count-Min Sketch, sometimes written as CountMin Sketch, is a probabilistic data structure used to estimate the frequencies of items in a data stream while using sublinear memory. It provides fast, approximate counts with formal error guarantees and is widely applied in streaming analytics, network monitoring, and large-scale databases.

The structure consists of a two-dimensional array of counters with d rows and w columns. Each row

Space and time complexity: the data structure uses O(w · d) counters. Update and query operations run

Variants and extensions include conservative updates to reduce overestimation and the ability to support deletions in

Applications include network traffic analysis, query frequency estimation in large databases, and real-time analytics where exact

History: Count-Min Sketch was introduced by Graham Cormode and S. Muthukrishnan in 2005.

has
an
independent
hash
function
that
maps
an
item
to
a
column
index.
To
update
an
item,
the
corresponding
counters
in
all
d
rows
are
incremented.
To
estimate
an
item's
frequency,
the
values
at
the
d
mapped
positions
are
read
and
the
minimum
of
these
values
is
returned.
Because
different
items
can
hash
to
the
same
positions,
estimates
can
overcount;
taking
the
minimum
across
rows
mitigates
this
effect.
in
O(d)
time.
By
choosing
w
and
d
appropriately—typically
w
≈
e/ε
and
d
≈
ln(1/δ)—one
can
bound
the
overestimation
of
any
item
by
ε
times
the
total
number
of
updates
with
probability
at
least
1
−
δ.
counting,
as
well
as
other
sketch
families
that
trade
accuracy
for
different
guarantees.
Limitations
include
reliance
on
quality
hash
functions
and
the
fact
that
exact
counts
are
not
recoverable;
estimates
may
be
biased
upward
in
the
presence
of
many
collisions.
counting
is
impractical.