UTF32LE - Infinite Lexicon

UTF32LE

UTF‑32 Little‑Endian (UTF32LE) is a Unicode character encoding that represents each code point with a single 32‑bit code unit, stored in little‑endian byte order (least significant byte first). Every code point, whether in the Basic Multilingual Plane or a supplementary plane, occupies one fixed‑length unit, which eliminates the surrogate pairs used in UTF‑16. Because every code unit has the same size, the n‑th code point sits at byte offset 4n, so indexing and random access to code points are straightforward. A user‑perceived character can still span several code points, for example a base letter followed by combining marks. UTF32LE is therefore useful where speed and simplicity of access outweigh memory efficiency.
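The fixed‑width layout can be illustrated with a minimal Python sketch. The helper names `encode_utf32le` and `code_point_at` are hypothetical and exist only for this example; in practice Python's built‑in `utf-32-le` codec does the encoding.

```python
import struct

def encode_utf32le(text: str) -> bytes:
    """Pack each code point as one little-endian unsigned 32-bit integer."""
    return b"".join(struct.pack("<I", ord(ch)) for ch in text)

def code_point_at(data: bytes, index: int) -> str:
    """Fetch the code point at `index` by jumping straight to byte offset 4 * index."""
    (value,) = struct.unpack_from("<I", data, index * 4)
    return chr(value)

sample = "A€😀"  # U+0041, U+20AC, U+1F600 (the last one is outside the BMP)
encoded = encode_utf32le(sample)

print(encoded.hex(" ", 4))        # 41000000 ac200000 00f60100
print(code_point_at(encoded, 2))  # 😀
```

Each group of four bytes holds one code point with its least significant byte first, and the supplementary‑plane character takes exactly as much room as the ASCII letter.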

UTF32LE does not require a byte‑order mark (BOM) to indicate endianness; the encoding label itself implies little‑endian byte order.
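The BOM behaviour can be observed with Python's standard codecs, shown here as a small sketch rather than a statement about every implementation. The variable names are illustrative.

```python
import codecs

text = "hi"

# The explicit little-endian codec writes code units only, with no BOM.
explicit_le = text.encode("utf-32-le")

# The generic "utf-32" codec prepends a BOM in the machine's native byte order.
generic = text.encode("utf-32")

print(explicit_le.startswith(codecs.BOM_UTF32_LE))  # False
print(len(explicit_le), len(generic))               # 8 12

# If a BOM is present anyway, the utf-32-le decoder keeps it as U+FEFF
# instead of treating it as a byte-order signature.
with_bom = codecs.BOM_UTF32_LE + explicit_le
print(repr(with_bom.decode("utf-32-le")))           # '\ufeffhi'
```

A consumer that is told the data is UTF32LE therefore needs no signature to pick the byte order, and should be prepared to treat a leading U+FEFF as ordinary content.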

Memory overhead is a drawback: UTF32LE uses four bytes per code point, quadrupling the storage size compared with a one‑byte‑per‑character encoding such as ASCII, or with UTF‑8 for ASCII‑range text.
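As a rough illustration of the storage cost, the following sketch compares encoded sizes using Python's built‑in codecs. The function name `encoded_sizes` and the sample strings are assumptions made for this example.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes of `text` under three Unicode encodings."""
    return {name: len(text.encode(name)) for name in ("utf-8", "utf-16-le", "utf-32-le")}

print(encoded_sizes("hello"))   # {'utf-8': 5, 'utf-16-le': 10, 'utf-32-le': 20}
print(encoded_sizes("日本語"))   # {'utf-8': 9, 'utf-16-le': 6, 'utf-32-le': 12}
print(encoded_sizes("😀"))      # {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```

The fourfold overhead applies to ASCII‑range text. The gap narrows for other scripts and disappears for supplementary‑plane characters, which need four bytes in all three encodings.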

interoperability
