CP65001

CP65001 is the Windows code page number for UTF-8. In Windows terminology, a code page identifies how a byte string should be interpreted as characters. CP65001 maps to the UTF-8 encoding, so applications that rely on code pages to convert between multibyte and wide-character representations can select UTF-8 by specifying this identifier.
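
To make the idea of a code page concrete, the following minimal sketch decodes the same bytes under two interpretations. It uses Python's built-in codecs rather than any Windows API, and the choice of Windows-1252 as the contrasting legacy code page is an assumption made purely for illustration.

```python
# The same two bytes, read under two different code pages.
raw = "é".encode("utf-8")            # b"\xc3\xa9"

as_utf8 = raw.decode("utf-8")        # UTF-8 is the encoding that code page 65001 denotes
as_cp1252 = raw.decode("cp1252")     # Windows-1252, chosen here as an example legacy code page

print(repr(as_utf8))
print(repr(as_cp1252))
```

Read as UTF-8, the two bytes form the single character "é". Read as Windows-1252, each byte becomes its own character, which is the familiar mojibake that results from applying the wrong code page.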

In practice, CP65001 is used by certain Windows APIs and applications that operate on multibyte strings. Functions such as MultiByteToWideChar and WideCharToMultiByte can use CP65001 as the source or target encoding, enabling conversion between UTF-8 and UTF-16 (the internal Windows Unicode encoding). The current ANSI code page for a process can be queried with GetACP. The console's input and output code pages can be changed with SetConsoleCP and SetConsoleOutputCP, though changing the console code page can affect how text is rendered and interpreted.

CP65001 has historically presented compatibility challenges. Some older or poorly implemented software relied on specific non-Unicode code pages and may fail or produce incorrect results when switched to CP65001. Early Windows versions also experienced reliability issues with UTF-8 handling in certain API paths, especially in console I/O and in files or registry data that were not strictly valid UTF-8. Support has improved in modern Windows releases, but developers still encounter edge cases when mixing UTF-8 with legacy code paths or with non-Unicode APIs.

Best practices suggest using Unicode APIs directly where possible and treating CP65001 as a bridge for UTF-8 data rather than a primary internal encoding. When using CP65001, validate UTF-8 input, handle potential conversion errors gracefully, and be mindful of console and filesystem interactions that depend on the active code page.
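
The advice to validate input and handle conversion errors can be sketched as follows. This is an illustrative example, not a definitive implementation: the helper names `utf8_to_text` and `decode_or_replace` are hypothetical, the Windows branch calls MultiByteToWideChar through `ctypes` with the `MB_ERR_INVALID_CHARS` flag so that malformed input is rejected, and on other platforms Python's own strict UTF-8 decoder stands in for the API call.

```python
import sys

CP_UTF8 = 65001                     # Windows code page identifier for UTF-8
MB_ERR_INVALID_CHARS = 0x00000008   # ask the API to fail on malformed input


def utf8_to_text(data: bytes) -> str:
    """Strictly convert UTF-8 bytes to text; raise ValueError on invalid input."""
    if sys.platform != "win32":
        # Portable stand-in for the Windows call: Python's strict decoder.
        try:
            return data.decode("utf-8", errors="strict")
        except UnicodeDecodeError as exc:
            raise ValueError(f"invalid UTF-8: {exc}") from exc

    import ctypes
    from ctypes import wintypes

    convert = ctypes.WinDLL("kernel32", use_last_error=True).MultiByteToWideChar
    convert.argtypes = [wintypes.UINT, wintypes.DWORD, ctypes.c_char_p,
                        ctypes.c_int, ctypes.c_wchar_p, ctypes.c_int]
    convert.restype = ctypes.c_int

    if not data:
        return ""  # the API reports an error for zero-length input

    # First call: ask how many UTF-16 code units the result needs.
    needed = convert(CP_UTF8, MB_ERR_INVALID_CHARS, data, len(data), None, 0)
    if needed == 0:
        raise ValueError(f"invalid UTF-8 (Windows error {ctypes.get_last_error()})")

    # Second call: perform the conversion into a buffer of that size.
    buffer = ctypes.create_unicode_buffer(needed)
    written = convert(CP_UTF8, MB_ERR_INVALID_CHARS, data, len(data), buffer, needed)
    if written == 0:
        raise ValueError(f"conversion failed (Windows error {ctypes.get_last_error()})")
    return buffer[:written]


def decode_or_replace(data: bytes) -> str:
    """Graceful fallback: substitute U+FFFD for bytes that are not valid UTF-8."""
    try:
        return utf8_to_text(data)
    except ValueError:
        return data.decode("utf-8", errors="replace")
```

A caller that must reject bad data can use `utf8_to_text` and catch `ValueError`, while a caller that only needs something displayable can use `decode_or_replace`. Which policy is appropriate depends on whether silently substituted characters are acceptable for the data in question.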

See also UTF-8, Unicode, and the concept of ANSI code pages in Windows.