specialcasing

Specialcasing refers to case-mapping rules that depend on language, locale, or context, rather than on a single, universal mapping. In Unicode and related text processing systems, special casing covers the exceptions to the general one-to-one uppercase-lowercase conversions, and its rules are defined in data files such as SpecialCasing.txt. These rules are needed when a language's orthography requires a locale-aware or context-aware transformation.
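To make the data-file idea concrete, here is a minimal Python sketch of reading entries in the semicolon-separated layout that SpecialCasing.txt documents in its header (`<code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>`). The names `SpecialCasingRule`, `parse_special_casing_line`, `SAMPLE`, and `RULES` are illustrative assumptions, and the sample lines are a small excerpt written in the file's style, not the full data set.

```python
from typing import NamedTuple, Optional, Tuple


class SpecialCasingRule(NamedTuple):
    """One parsed entry (hypothetical container, not an official API)."""
    code: str                    # the source character
    lower: str                   # lowercase mapping
    title: str                   # titlecase mapping
    upper: str                   # uppercase mapping
    conditions: Tuple[str, ...]  # e.g. ("tr",) or ("Final_Sigma",); empty if unconditional


def _decode(field: str) -> str:
    """Turn a field of hex code points such as '0053 0053' into the string 'SS'."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def parse_special_casing_line(line: str) -> Optional[SpecialCasingRule]:
    """Parse one line; return None for blank lines and comment-only lines."""
    data = line.split("#", 1)[0].strip()   # drop the trailing comment
    if not data:
        return None
    fields = [f.strip() for f in data.split(";")]
    if fields and fields[-1] == "":        # the trailing ';' leaves an empty field
        fields.pop()
    code, lower, title, upper = (_decode(f) for f in fields[:4])
    conditions = tuple(fields[4].split()) if len(fields) > 4 else ()
    return SpecialCasingRule(code, lower, title, upper, conditions)


# Assumed sample input in the file's layout.
SAMPLE = """\
# <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>
00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
03A3; 03C2; 03A3; 03A3; Final_Sigma; # GREEK CAPITAL LETTER SIGMA
0069; 0069; 0130; 0130; tr; # LATIN SMALL LETTER I
"""

RULES = [r for r in map(parse_special_casing_line, SAMPLE.splitlines()) if r]
```

In this sketch an entry with an empty `conditions` tuple applies everywhere, while a language tag such as `tr` or a context name such as `Final_Sigma` restricts when the mapping is used. A real implementation would still need to evaluate those conditions against the surrounding text and the active locale.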

Common examples include Turkish and Azerbaijani: the uppercase of the letter i is İ (I with a dot), and the lowercase of I is ı (dotless i). Because of this, simple, language-agnostic case conversion can produce incorrect results in these languages, so special casing provides the correct mappings for each locale. Another well-known case concerns German, where the lowercase ß maps to SS when uppercased; Unicode also encodes a distinct capital form ẞ (capital sharp S), which is used in some contexts. Greek also has context-dependent rules for the letter sigma: the lowercase form σ is used inside words, while the final form ς appears at the end of words, and both map to the uppercase Σ.

Applications of special casing include correct display in typography, accurate search and indexing, and locale-aware sorting and comparison. It complements general case folding and locale-sensitive operations in programming languages and libraries (such as ICU and language-specific runtimes) to ensure linguistically correct and visually appropriate text transformations. Limitations arise from the diversity of languages and evolving orthographic standards, so implementations must reference up-to-date data like the Unicode SpecialCasing rules.
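
The Turkish, German, and Greek cases described above can be sketched in Python. The built-in `str.upper()` and `str.lower()` apply the language-independent mappings (including the ß expansion and the final-sigma rule) but take no locale argument, so the Turkish and Azerbaijani behaviour is approximated here with two hypothetical helpers, `turkish_upper` and `turkish_lower`. This is a simplified illustration that assumes precomposed characters; production code would more likely rely on a library such as ICU.

```python
def turkish_upper(text: str) -> str:
    """Uppercase using the Turkish/Azerbaijani i -> İ pairing (hypothetical helper).

    Assumes precomposed input; 'I' followed by a combining dot is not handled.
    """
    # Map i to İ (U+0130) first; the default mapping already sends ı to I.
    return text.replace("i", "\u0130").upper()


def turkish_lower(text: str) -> str:
    """Lowercase using the Turkish/Azerbaijani I -> ı pairing (hypothetical helper)."""
    # Map İ to i and I to ı (U+0131) before applying the default mapping.
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


# Default, language-agnostic mappings built into Python's str methods.
assert "i".upper() == "I"              # not what Turkish expects
assert "straße".upper() == "STRASSE"   # ß expands to two letters
assert "ẞ".lower() == "ß"              # capital sharp S lowercases to ß
assert "ΣΟΦΟΣ".lower() == "σοφος"      # word-final Σ becomes ς, the first becomes σ
assert "σοφος".upper() == "ΣΟΦΟΣ"      # both σ and ς uppercase to Σ

# Locale-specific behaviour via the helpers above.
assert turkish_upper("istanbul") == "İSTANBUL"
assert turkish_lower("DİYARBAKIR") == "diyarbakır"
```

Note that the ß expansion does not round-trip: lowercasing "STRASSE" yields "strasse", not "straße", which is one reason case conversion and case-insensitive comparison are usually treated as separate operations.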