decodeutf8
Decodeutf8 refers to the process of converting a sequence of bytes encoded in UTF-8 into a Unicode string. UTF-8 is the dominant encoding for Unicode on the web and in many programming environments, using one to four bytes per character. Decoding UTF-8 involves reading the byte stream, interpreting multi-byte sequences, and producing the corresponding Unicode code points.
Input for a decodeutf8 operation is typically a byte array or binary string. The output is a
Error handling varies. Strict decoders raise errors when encountering invalid UTF-8 sequences, overlong encodings, or code
In practice, decodeutf8 is provided by standard libraries under various names. In Python, for example, a byte
See also: UTF-8, Unicode, encoding, decoding, BOM, replacement character.