Home

KMP

KMP, or Knuth–Morris–Pratt algorithm, is a string-search algorithm that finds occurrences of a pattern P within a text T in O(n + m) time, where n is the length of T and m is the length of P. It achieves this efficiency by using information gathered during a preprocessing stage of the pattern to avoid unnecessary comparisons.

The preprocessing step computes the longest proper prefix that is also a suffix (the LPS array) for

During the search, the algorithm compares P and T from left to right. When characters match, both

KMP is widely used for exact substring search in applications such as text processing and bioinformatics. It

the
pattern.
For
each
position
i,
lps[i]
stores
the
length
of
the
longest
prefix
of
P
that
is
also
a
suffix
of
P[0..i].
This
table
guides
how
far
to
shift
the
pattern
when
a
mismatch
occurs,
so
previously
matched
characters
do
not
need
to
be
rechecked.
indices
advance.
On
a
mismatch,
the
pattern
is
shifted
based
on
the
lps
value,
and
the
next
comparison
uses
already
matched
information.
As
a
result,
each
character
of
the
text
is
examined
at
most
once,
and
the
total
number
of
pattern
examinations
is
bounded,
yielding
linear
time.
requires
O(m)
extra
space
for
the
lps
array
and
works
on
any
alphabet.
Variants
exist
for
multi-pattern
search
and
for
approximate
matching,
but
the
classic
form
focuses
on
exact
matching.
The
algorithm
was
introduced
in
1977
by
Donald
Knuth,
Vaughan
Pratt,
and
James
Morris.