KRIMP

KRIMP is a data mining algorithm that applies the minimum description length (MDL) principle to identify a compact, interpretable set of patterns that summarize a transactional database. It is used for tasks such as frequent pattern mining, data compression, and anomaly detection.

The core idea is to build a code table consisting of selected patterns (typically itemsets) and assign a code to each. A transaction is encoded by choosing a covering of its items with patterns from the code table. The total description length equals the sum of the code table’s description length plus the description length of the data given the code table. KRIMP searches for the code table that minimizes this total length, usually by greedy or heuristic search over candidate itemsets. By favoring concise representations, KRIMP discourages overfitting and yields a compact summary of the data.

The resulting code table provides a compact summary of the database in terms of representative patterns. This summary can be used for further mining tasks, such as identifying frequent itemsets, clustering transactions by their encoding, and detecting outliers (transactions that are poorly compressed). The method emphasizes interpretability, since the patterns in the code table are explicit itemsets.

Limitations and considerations include dependence on the quality of candidate patterns and the search strategy, potential computational intensity for large datasets, and sensitivity to parameter choices and initial pattern sets. KRIMP is primarily applied to binary transaction data and may require discretization or adaptation for continuous data. Variants and related MDL-based approaches have extended the idea to broader pattern mining and data summarization tasks.
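
The encoding scheme described in this article (covering each transaction with code-table patterns, then summing the code table's length and the data's length) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not a reference implementation: the toy database, the single candidate pattern `{a, b}`, and the function names `cover`, `code_lengths`, `encoded_length`, and `total_length` are all hypothetical. The code table is assumed to be already ordered and to contain every singleton item so that a cover always exists, and the candidate search itself is omitted.

```python
from collections import Counter
from math import log2


def cover(transaction, code_table):
    """Cover a transaction with disjoint patterns, trying them in table order."""
    remaining = set(transaction)
    used = []
    for pattern in code_table:
        if pattern <= remaining:  # pattern fits inside the still-uncovered items
            used.append(pattern)
            remaining -= pattern
    return used


def code_lengths(db, code_table):
    """Code length in bits per pattern, derived from how often the cover uses it."""
    usage = Counter(p for t in db for p in cover(t, code_table))
    total = sum(usage.values())
    # Frequently used patterns get short codes; unused patterns get no code.
    return {p: -log2(u / total) for p, u in usage.items()}


def encoded_length(transaction, code_table, lengths):
    """Bits needed to encode one transaction with the given code table."""
    return sum(lengths[p] for p in cover(transaction, code_table))


def total_length(db, code_table, singletons):
    """Simplified total: bits for the code table plus bits for the data."""
    lengths = code_lengths(db, code_table)
    standard = code_lengths(db, singletons)  # singleton-only baseline codes
    data_bits = sum(encoded_length(t, code_table, lengths) for t in db)
    # Assumption: each used pattern costs its own code plus the baseline
    # codes of the items it contains.
    table_bits = sum(
        bits + sum(standard[frozenset([item])] for item in pattern)
        for pattern, bits in lengths.items()
    )
    return table_bits + data_bits


# Toy database (hypothetical): six transactions over the items a, b, c, d.
db = [frozenset(t) for t in ("abc", "abc", "abc", "ab", "ab", "d")]
singletons = [frozenset([item]) for item in sorted(set().union(*db))]
code_table = [frozenset("ab")] + singletons  # larger pattern is tried first

lengths = code_lengths(db, code_table)
baseline_bits = total_length(db, singletons, singletons)
candidate_bits = total_length(db, code_table, singletons)

# The transaction that needs the most bits is the most poorly compressed one.
outlier = max(db, key=lambda t: encoded_length(t, code_table, lengths))

print(f"singletons only: {baseline_bits:.2f} bits")
print(f"with pattern ab: {candidate_bits:.2f} bits")
print("most poorly compressed transaction:", sorted(outlier))
```

In this toy data, adding the pattern `{a, b}` lowers the total description length relative to the singleton-only table, so a search that minimizes total length would keep it. The rare transaction `{d}` receives the longest encoding, which is the sense in which poorly compressed transactions can be flagged as outliers.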