Home

casefolding

Case folding is a text processing technique used to enable case-insensitive comparisons across Unicode text. It transforms strings according to the Unicode case folding rules rather than simple lowercasing, producing a form suitable for matching. Unlike ordinary lowercasing, case folding is language-neutral and may produce multi-character mappings. For example, the German character ß (eszett) maps to "ss" in case folding, and the string "Straße" case-folds to "strasse". Some characters may map to different lengths or sequences, and not all language-specific distinctions are preserved. The aim is to provide caseless equality across scripts and alphabets rather than to generate user-visible text.

This operation is defined by the Unicode Standard, notably Unicode Technical Report #36 and the casefolding

In practice, many programming environments offer a case-folding function or method; for instance, Python provides str.casefold()

data.
It
is
widely
used
in
search
indexing,
database
querying,
and
other
algorithms
where
case-insensitive
matching
is
required
across
multiple
languages
and
scripts.
to
perform
Unicode-aware
case
folding.
Implementations
may
differ
in
edge
cases,
and
case
folding
does
not
replace
locale-aware
lowercasing
for
display
or
sorting.
It
should
be
chosen
when
the
task
is
to
determine
whether
two
strings
are
equivalent
under
case-insensitive
comparison
rather
than
to
show
normalized
text.