UnicodeCodepunkten
UnicodeCodepunkten (Unicode code points) are the numeric identifiers assigned to characters in the Unicode standard. Each code point is an abstract unit that uniquely identifies a character, independent of how that character is stored or rendered. A code point is written as U+ followed by four to six hexadecimal digits: four for most common characters (for example, U+0041) and more for higher code points (for example, U+1F600).
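As a minimal sketch of the U+XXXX notation, the following Python converts between a character and its written code point. The helper names `format_code_point` and `parse_code_point` are illustrative assumptions, not part of any standard library.

```python
def format_code_point(ch: str) -> str:
    """Return the U+XXXX notation for a single character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # ord() yields the code point as an integer; pad to at least four hex digits.
    return f"U+{ord(ch):04X}"


def parse_code_point(notation: str) -> str:
    """Return the character named by a string such as 'U+0041'."""
    if not notation.upper().startswith("U+"):
        raise ValueError("expected notation of the form U+XXXX")
    # chr() raises ValueError for values above 0x10FFFF.
    return chr(int(notation[2:], 16))


print(format_code_point("A"))          # U+0041
print(format_code_point("\U0001F600")) # U+1F600
print(parse_code_point("U+0041"))      # A
```

Padding to four digits with `04X` leaves longer values such as 1F600 unchanged, which matches the usual written convention.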
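The Unicode code space ranges from U+0000 to U+10FFFF and is divided into 17 planes of 65,536 code points each. The Basic Multilingual Plane (plane 0, U+0000 to U+FFFF) contains most characters in common use.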
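Because each of the 17 planes spans 65,536 (0x10000) code points, the plane number can be read from the bits above the low 16. The sketch below assumes a hypothetical helper named `plane_of`; it is an illustration, not a standard API.

```python
MAX_CODE_POINT = 0x10FFFF  # upper bound of the Unicode code space


def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that contains the code point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("code point outside the Unicode code space")
    # Each plane holds 0x10000 code points, so shift away the low 16 bits.
    return code_point >> 16


print(plane_of(0x0041))    # 0
print(plane_of(0x1F600))   # 1
print(plane_of(0x10FFFF))  # 16
```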
Code points are encoded into bytes by encoding forms such as UTF-8, UTF-16, and UTF-32. UTF-8 uses one to four bytes per code point.
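To illustrate how one code point maps to different byte sequences, this sketch encodes a character with Python's built-in codecs. The function name `encoded_forms` is an assumption made for this example. The big-endian codec variants are used so that no byte order mark is added to the output.

```python
def encoded_forms(ch: str) -> dict[str, str]:
    """Return the hex bytes of one character in UTF-8, UTF-16, and UTF-32."""
    codecs = {"UTF-8": "utf-8", "UTF-16": "utf-16-be", "UTF-32": "utf-32-be"}
    return {name: ch.encode(codec).hex() for name, codec in codecs.items()}


print(encoded_forms("A"))
# {'UTF-8': '41', 'UTF-16': '0041', 'UTF-32': '00000041'}
print(encoded_forms("\U0001F600"))
# {'UTF-8': 'f09f9880', 'UTF-16': 'd83dde00', 'UTF-32': '0001f600'}
```

For U+1F600 the UTF-16 result is a surrogate pair (D83D DE00), since the code point lies above U+FFFF.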
In source code and data formats, UnicodeCodepunkten are referenced directly or via escapes, for example U+0041
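Escape syntax differs between languages and data formats. As one example, Python string literals accept the spellings below. This is a sketch of Python's syntax only, and the variable names are illustrative.

```python
# Four ways to write code points in a Python string literal.
literal = "A"                                # the character itself
short_escape = "\u0041"                      # \u takes exactly four hex digits
named_escape = "\N{LATIN CAPITAL LETTER A}"  # lookup by Unicode character name
long_escape = "\U0001F600"                   # \U takes exactly eight hex digits

# The first three spellings produce the same one-character string.
print(literal == short_escape == named_escape)  # True
print(hex(ord(long_escape)))                    # 0x1f600
```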
Understanding code points is essential for text processing, normalization, rendering, and internationalization, and clarifies the distinction
See also Unicode, code point concept, surrogate pair, UTF-8, UTF-16, UTF-32.