checkpointrestore

Checkpoint/restore is a technique for saving the complete state of a running computation or execution environment to durable storage so that it can be restored and resumed at a later time. It is used to provide fault tolerance, support long-running workflows, and enable live workload migration.
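
The save-and-resume idea can be illustrated at the application level with a minimal Python sketch. It is not a system-level checkpointer: it only saves a small dictionary that the program itself chooses to persist. The function names `save_checkpoint`, `load_checkpoint`, and `run_sum`, the `stop_at` parameter used to simulate a failure, and the choice of `pickle` as the on-disk format are all assumptions made for this example.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write state to path atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to durable storage
        os.replace(tmp_path, path)  # atomic rename over the previous checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path, default):
    """Return the saved state, or default when no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run_sum(limit, path, stop_at=None, every=100):
    """Sum 0..limit-1, checkpointing every `every` steps.

    stop_at (hypothetical, for illustration) simulates a failure at that step.
    """
    state = load_checkpoint(path, {"i": 0, "total": 0})
    while state["i"] < limit:
        if stop_at is not None and state["i"] == stop_at:
            return None  # simulated failure: work since the last checkpoint is lost
        state["total"] += state["i"]
        state["i"] += 1
        if state["i"] % every == 0:
            save_checkpoint(state, path)
    save_checkpoint(state, path)
    return state["total"]
```

Calling `run_sum(1000, path, stop_at=550)` and then `run_sum(1000, path)` resumes from the checkpoint taken at step 500 rather than starting over, and the second call returns the same total as an uninterrupted run.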

A checkpoint typically captures the memory contents, processor state, and the state of runtime resources such as open file descriptors, sockets, and in-flight I/O. Restoring reconstructs the execution context and reestablishes resources. Checkpointing can be performed in full, recording all state, or incrementally, recording only changes since the previous checkpoint. Coordinated checkpoints require application-wide agreement on a consistent save point, while uncoordinated approaches save independently and rely on recovery mechanisms to resolve inconsistencies.

Common domains include process and container management as well as virtual machines. In Linux user space, tools like CRIU (Checkpoint/Restore In Userspace) enable checkpointing and restoring of individual processes and containers, supporting live migration and fault-tolerance workflows. DMTCP is another user-space checkpointing framework for multi-threaded applications. For virtual machines, hypervisors such as QEMU/KVM implement live migration by transferring a VM’s state to another host.

Applications include high-performance computing, database systems, and cloud platforms where paused workloads must be resumed without data loss. Checkpointing also serves as a maintenance mechanism to pause and resume long-running computations during upgrades or hardware changes.

Challenges include handling non-deterministic state, external resources, and network connections; interacting with kernels and device drivers; ensuring portability across architectures; and managing the performance overhead and security implications of saving and restoring state.

See also: checkpoint, process migration, live migration, fault tolerance, persistence.