Serializations

Serialization is the process of converting an in-memory data structure or object into a format that can be stored on disk, transmitted over a network, or otherwise persisted. Deserialization is the reverse operation. Formats are broadly categorized as text-based or binary. Text-based formats such as JSON, XML, and YAML are human-readable, while binary formats such as Protocol Buffers, Avro, Thrift, and MessagePack emphasize compactness and speed and often rely on a predefined schema.
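
As a minimal sketch of the text versus binary trade-off, the following Python snippet serializes the same record with the standard-library `json` module and with a fixed binary layout via `struct`. The record fields and the layout string `<IdH` are hypothetical choices made for illustration; the layout stands in for a predefined schema that both writer and reader must agree on.

```python
import json
import struct

# Hypothetical record used only for illustration.
record = {"id": 7, "temperature": 21.5, "flags": 3}

# Text-based: field names travel with the data and the result is human-readable.
text_bytes = json.dumps(record).encode("utf-8")

# Binary: compact, but the layout below is an assumed schema
# (little-endian: uint32 id, float64 temperature, uint16 flags).
LAYOUT = "<IdH"
binary_bytes = struct.pack(
    LAYOUT, record["id"], record["temperature"], record["flags"]
)

# Deserialization reverses each step.
from_text = json.loads(text_bytes.decode("utf-8"))
ident, temperature, flags = struct.unpack(LAYOUT, binary_bytes)
from_binary = {"id": ident, "temperature": temperature, "flags": flags}
```

The binary form is smaller because it omits field names and punctuation, but a reader that does not know the layout cannot interpret it.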

Some formats are self-describing, meaning the data includes enough information to interpret its structure (as JSON or XML). Others are schema-based, requiring an external definition to encode and decode data (as Protocol Buffers, Avro, or Thrift). Schema evolution rules determine how data written with one version of a schema can be read by another, affecting backward and forward compatibility.

Serialization is used for data persistence, inter-process communication, remote procedure calls, caching, and messaging. Designers must balance readability, interoperability, performance, and durability. Security concerns include deserialization vulnerabilities when processing untrusted data; validation, integrity checks, and least-privilege processing are recommended.

Common terms include marshal/unmarshal, pickling in Python, and Java's built-in serialization. Developers may choose streaming serializers to process large data or partial objects. Cross-language compatibility and tooling support influence format choice.
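
A minimal sketch of streaming serialization, assuming newline-delimited JSON as the wire format (one possible choice, not something prescribed here). The helper names `write_records` and `read_records` are hypothetical. Each record is written and read one at a time, so the whole dataset never has to be held in memory at once.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def write_records(stream: TextIO, records: Iterable[dict]) -> None:
    """Serialize records one per line instead of as one large document."""
    for rec in records:
        stream.write(json.dumps(rec))
        stream.write("\n")


def read_records(stream: TextIO) -> Iterator[dict]:
    """Lazily deserialize one record per line."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Usage with an in-memory buffer standing in for a file or socket.
buffer = io.StringIO()
write_records(buffer, ({"n": i} for i in range(3)))
buffer.seek(0)
first = next(read_records(buffer))
```

Because `read_records` is a generator, a consumer can stop after the first few records without parsing the rest of the stream.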