casefold

Casefold is a process used to convert text into a form that can be compared in a case-insensitive way. It is based on Unicode case folding and is intended for caseless matching across different scripts and languages. Case folding is more expansive than simple lowercasing, because it may map certain characters to multiple code points or expand ligatures into their component letters.
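
The difference between lowercasing and case folding can be sketched in Python with the built-in `str.lower()` and `str.casefold()` methods. This is a minimal illustration, not a complete caseless-matching routine: the helper name `caseless_equal` is hypothetical, and the sketch assumes the inputs need no separate Unicode normalization.

```python
def caseless_equal(left: str, right: str) -> bool:
    """Compare two strings after case folding (hypothetical helper)."""
    return left.casefold() == right.casefold()


# Lowercasing leaves ß unchanged; case folding expands it to "ss".
print("Straße".lower())     # straße
print("Straße".casefold())  # strasse

# Only the folded forms compare equal.
print("Straße".lower() == "STRASSE".lower())  # False
print(caseless_equal("Straße", "STRASSE"))    # True

# The ligature U+FB01 (ﬁ) folds to the two letters "f" and "i",
# so the folded string is one code point longer than the original.
ligature_word = "\ufb01le"
print(ligature_word.casefold())                          # file
print(len(ligature_word), len(ligature_word.casefold())) # 3 4
```

Because the folded string can be longer than the original, the folded form is best treated as a comparison key and not as display text.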

In programming, casefolding is typically exposed as a function or method that returns the folded form of a string. It is designed for robust text comparisons and hashing, where language-agnostic matching is important. Casefold differs from lowercase in that it uses a broader Unicode folding mapping; for example, the German letter ß is folded to "ss", and certain ligatures may be expanded to multiple characters.

Examples illustrate its behavior but are not meant to preserve the original appearance of text. For instance, the German ß in a string becomes "ss" after casefolding, and some typographic ligatures can be expanded to their constituent letters. The result is intended to enable reliable caseless comparisons across languages and scripts rather than to produce a visually canonical form.

Applications include search, matching, and normalization tasks where case should be ignored. Casefolding is designed for stable, language-agnostic comparisons, though it is not reversible in general. It is defined in Unicode as part of the Unicode Case Folding specification and is implemented in languages and libraries that support Unicode-aware text processing, including Python’s str.casefold().

See also: Unicode Case Folding, Unicode Standard, Unicode Standard Annex #21 (UAX #21, Case Mappings).