Multiversion

Multiversion is a design principle in data management and software systems that maintains multiple historical versions of an item to support concurrent access, historical queries, and non-blocking reads. The term is most commonly associated with multiversion concurrency control (MVCC) in database systems.

In MVCC, each data item can have several versions, each associated with a version identifier such as a transaction timestamp or a logical clock. When a transaction starts, it views a consistent snapshot corresponding to its version identifier, even as other transactions write new versions. Writes create new versions rather than overwriting existing ones, enabling readers to proceed without waiting for writers in many cases. This approach reduces locking and can improve read throughput and support long-running reads.

To manage storage, systems employ a garbage collection or version-pruning process to remove obsolete versions that are no longer visible to any active transaction. The exact rules for visibility and cleanup depend on the implementation and the chosen consistency guarantees.

Advantages of multiversion approaches include non-blocking reads, finer-grained isolation, and support for time-travel or historical queries. They are widely used in relational databases such as PostgreSQL and Oracle, as well as in various NoSQL systems and distributed data stores that prioritize concurrent access.

Challenges include increased storage overhead, the need for sophisticated garbage collection, potential complexity in conflict resolution, and ensuring correct visibility across distributed components. Proper tuning and workload understanding are important to reap the benefits of multiversion techniques.
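
The snapshot-read and version-pruning behavior described above can be illustrated with a small in-memory sketch. It is not how PostgreSQL, Oracle, or any particular system implements MVCC. The class name MVStore and its methods (write, begin, end, read, prune) are hypothetical, chosen only for this example, and the sketch assumes a single process, writes that commit immediately, and no write-write conflict detection.

```python
import itertools


class MVStore:
    """Toy multiversion key-value store (illustrative, hypothetical API)."""

    def __init__(self):
        self._clock = itertools.count(1)  # logical clock used as version identifier
        self._versions = {}               # key -> list of (timestamp, value), ascending
        self._active = set()              # snapshot timestamps of open readers

    def write(self, key, value):
        """Append a new version instead of overwriting the existing one."""
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        return ts

    def begin(self):
        """Open a read snapshot; it sees every version written so far."""
        ts = next(self._clock)
        self._active.add(ts)
        return ts

    def end(self, snapshot_ts):
        """Close a snapshot so that pruning no longer has to honor it."""
        self._active.discard(snapshot_ts)

    def read(self, key, snapshot_ts):
        """Return the newest value whose timestamp is <= snapshot_ts."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None

    def prune(self):
        """Remove versions that no active snapshot can see; return the count."""
        horizon = min(self._active, default=None)
        removed = 0
        for chain in self._versions.values():
            if horizon is None:
                keep_from = len(chain) - 1  # no readers: keep only the newest version
            else:
                # The oldest reader sees the newest version at or before its
                # timestamp; everything older than that is invisible to everyone.
                visible = [i for i, (ts, _) in enumerate(chain) if ts <= horizon]
                keep_from = visible[-1] if visible else 0
            removed += keep_from
            del chain[:keep_from]
        return removed


store = MVStore()
store.write("x", "v1")
snap = store.begin()          # snapshot taken while "v1" is the latest version
store.write("x", "v2")        # the writer does not block the open reader
old_view = store.read("x", snap)   # still "v1"
store.end(snap)
pruned = store.prune()        # "v1" is now obsolete and is removed
```

In this sketch the reader that opened its snapshot before the second write keeps seeing the first value, and the old version becomes eligible for removal only once that snapshot is closed. Real systems add commit status, conflict handling, and durability on top of this basic idea.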