duplico

Duplico is a term used in information technology to describe a family of data duplication and deduplication techniques designed to improve storage efficiency and data transfer in distributed systems. The concept combines content-defined chunking, cryptographic hashing of chunks, and a reference-counted content-addressable storage model to store only unique data blocks, with duplicates represented as lightweight references.
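
The combination described above (content-defined chunking, cryptographic chunk hashes, and a reference-counted content-addressable store) can be illustrated with a minimal sketch. Everything named here is a hypothetical illustration and not part of any Duplico specification: the `split_chunks` function, the `ChunkStore` class, and the window, mask, and size constants are assumptions chosen for readability. The rolling value is a simple windowed byte sum, a toy stand-in for the stronger rolling hashes a production chunker would likely use.

```python
import hashlib

# Illustrative parameters (assumptions, not taken from any Duplico spec).
WINDOW = 16            # bytes covered by the rolling sum
MASK = (1 << 6) - 1    # cut when the low 6 bits of the sum are all set
MIN_SIZE = 32          # never cut a chunk shorter than this
MAX_SIZE = 256         # always cut once a chunk reaches this length


def split_chunks(data: bytes) -> list[bytes]:
    """Split data at content-defined boundaries using a windowed byte sum."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        length = i - start + 1
        at_boundary = length >= MIN_SIZE and (rolling & MASK) == MASK
        if at_boundary or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


class ChunkStore:
    """Toy in-memory content-addressable store with reference counts."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}    # digest -> chunk bytes
        self.refcounts: dict[str, int] = {}   # digest -> number of references

    def write(self, data: bytes) -> list[str]:
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = chunk   # only unseen chunks are stored
            self.refcounts[digest] = self.refcounts.get(digest, 0) + 1
            recipe.append(digest)
        return recipe

    def read(self, recipe: list[str]) -> bytes:
        """Reassemble the original bytes from a recipe."""
        return b"".join(self.blocks[d] for d in recipe)

    def delete(self, recipe: list[str]) -> None:
        """Drop one reference per recipe entry; free chunks that reach zero."""
        for digest in recipe:
            self.refcounts[digest] -= 1
            if self.refcounts[digest] == 0:
                del self.refcounts[digest]
                del self.blocks[digest]
```

In this sketch, writing the same payload twice yields two identical recipes while the number of stored blocks stays unchanged, and a chunk is freed only once every recipe referencing it has been deleted.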

Etymology: The term derives from the Latin duplico, meaning to double or duplicate, reflecting its core goal of reducing redundancy across data sets.

Origins and standardization: Duplico originated in academic and industry collaborations in the early 2010s, with early implementations appearing in backup software. A formal open specification was proposed by the Duplico Consortium in 2018 and later adopted by several cloud storage platforms in the 2020s, though interoperability remains partial across vendors.

Architecture and operation: A typical Duplico deployment comprises a Duplico Core engine on storage nodes, a Duplico Client library integrated into applications, and a Duplico Index that tracks chunk hashes and reference counts. Data flow begins when an application writes data; the Core splits it into chunks, computes hashes, consults the index, stores only unseen chunks, and issues references to previously stored blocks for duplicates. Data can be encrypted at rest with per-chunk keys, and access control is defined at the metadata layer.

Applications and impact: Duplico is used in cloud backups, archival storage, container image distribution, and large-scale file systems. Benefits include reduced storage requirements and lower network bandwidth for replication; drawbacks include CPU overhead, metadata scaling considerations, and potential vendor lock-in.

See also: deduplication, content-addressable storage, chunking, replication.