percharacter

Percharacter, often written as per-character or character-level, is a term used in computing, linguistics, and typography to denote operations, metrics, or models that operate at the level of individual characters rather than words or subword units.

In text processing and OCR/ASR evaluation, per-character metrics include character error rate (CER) and character-level accuracy. CER measures mistakes per character, normalized by the reference character count. This approach is common for languages with rich morphology or non-Latin scripts, where word-level metrics may be less informative.

Character-level models process text as sequences of characters. They can capture spelling and long-range dependencies and are trained on raw text with minimal preprocessing. They contrast with word-level or subword models, and can generalize to unseen words or languages with limited word-level resources.

In typography and fonts, per-character information concerns glyph metrics, kerning, and rendering of individual characters. In user interfaces, per-character data underpins cursor movement, text selection, and accessibility features, ensuring precise interaction with text at the character level.

Limitations include longer sequence lengths and data sparsity for rare characters, as well as Unicode encoding challenges. Hybrid approaches that combine character-level signals with word- or subword-level representations are common to balance granularity with efficiency.

See also character-level modeling, word error rate, and Unicode typography.
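As an illustration of the character error rate described above, the following is a minimal Python sketch. It assumes that "mistakes" are counted as the Levenshtein edit distance (insertions, deletions, and substitutions) between a reference string and a hypothesis string, which is a common convention but not the only one. The function names `edit_distance` and `cer` are chosen here for illustration and do not come from any particular library.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `cer("kitten", "sitting")` evaluates to 0.5, because three edits are needed against six reference characters. Because the count is normalized by the reference length only, the value can exceed 1 when the hypothesis is much longer than the reference. Python iterates over strings by code point, so a character written with combining marks counts as several units in this sketch; this is one instance of the Unicode encoding challenges noted above.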