L-diversity

L-diversity, often written as l-diversity, is a privacy criterion used in data anonymization to prevent attribute disclosure. It extends k-anonymity by requiring that, within every group of records that share the same quasi-identifier values (an equivalence class), there are at least l distinct values for the sensitive attribute.
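
The definition above can be sketched as a small check over a table of records. This is a minimal illustration, not a reference implementation: the function name `is_l_diverse`, the column names, and the sample records are all hypothetical, and the records are assumed to be already generalized.

```python
from collections import defaultdict

def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """Return True if every equivalence class has at least l distinct sensitive values."""
    classes = defaultdict(set)
    for record in records:
        # Records with identical quasi-identifier values fall into the same class.
        key = tuple(record[q] for q in quasi_identifiers)
        classes[key].add(record[sensitive])
    return all(len(values) >= l for values in classes.values())

# Hypothetical, already-generalized records: age is a range, ZIP code is truncated.
diverse = [
    {"age": "30-39", "zip": "130**", "disease": d}
    for d in ["flu", "asthma", "diabetes", "gastritis", "migraine"]
]
uniform = [
    {"age": "40-49", "zip": "148**", "disease": "flu"}
    for _ in range(5)
]

print(is_l_diverse(diverse, ["age", "zip"], "disease", 3))            # True
print(is_l_diverse(diverse + uniform, ["age", "zip"], "disease", 3))  # False
```

The second call fails because a single class with only one distinct sensitive value is enough to violate the criterion for the whole dataset.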

In practice, data publishers generalize or suppress quasi-identifiers to form equivalence classes. A dataset satisfies l-diversity if, for every class, the set of values observed for the sensitive attribute contains at least l distinct values. For example, in a hospital dataset with a sensitive attribute "disease," a class with five patients all labeled with different diseases would meet l=3, while a class where all five have the same disease would not.

Benefits of l-diversity include reducing the risk that an attacker who knows the quasi-identifiers can deduce a person’s sensitive attribute from a small set of possible values. It helps mitigate homogeneous attribute disclosure that can occur under k-anonymity alone.

Limitations of l-diversity include its vulnerability to skewness and similarity attacks, where the distribution or semantic relatedness of values can still allow inference even when there are multiple distinct sensitive values in a class. It may also require substantial data generalization or suppression, reducing data utility. More advanced models such as t-closeness and differential privacy address some of these weaknesses.
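
The skewness problem can be illustrated with a short sketch. It assumes an attacker who knows a person belongs to a given class and simply guesses the most frequent sensitive value. The class contents and the helper name `most_likely_value` are hypothetical.

```python
from collections import Counter

def most_likely_value(sensitive_values):
    """Return the most common sensitive value in a class and its share of the class."""
    counts = Counter(sensitive_values)
    value, count = counts.most_common(1)[0]
    return value, count / len(sensitive_values)

# Hypothetical equivalence class with two distinct values, so it satisfies l=2.
skewed_class = ["positive"] * 49 + ["negative"]

distinct = len(set(skewed_class))
value, share = most_likely_value(skewed_class)

print(distinct)      # 2
print(value, share)  # positive 0.98
```

Counting distinct values alone does not capture this: the class passes the l=2 check, yet under the stated assumption the attacker's guess is right for 49 of the 50 records.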

See also k-anonymity, t-closeness, and differential privacy.