Postdeduplication
Postdeduplication, also called post-process deduplication, is the identification and removal of redundant data after it has already been written to storage. It contrasts with in-line deduplication, which analyzes incoming data and eliminates redundant blocks before they are written. In postdeduplication, data is first stored as-is, and a separate process later scans the stored data, finds identical chunks, and replaces each later occurrence with a pointer to the first instance.
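The scan-and-replace pass described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any particular product works: the store is modeled as an in-memory dictionary from chunk ID to chunk bytes, chunks are already split at fixed boundaries, a pointer is represented as a ("ref", chunk_id) tuple, and the names post_deduplicate and read_chunk are hypothetical.

```python
import hashlib


def post_deduplicate(store):
    """Scan an already-populated store and replace duplicate chunks with pointers.

    `store` maps chunk IDs to raw bytes (assumed layout). Duplicates are
    replaced in place by ("ref", first_chunk_id). Returns the bytes reclaimed.
    """
    first_seen = {}  # digest -> ID of the first chunk with that content
    reclaimed = 0
    for chunk_id in sorted(store):
        data = store[chunk_id]
        if not isinstance(data, bytes):
            continue  # already a pointer from an earlier pass
        digest = hashlib.sha256(data).digest()
        original = first_seen.get(digest)
        # Compare the bytes as well, so a hash collision can never merge
        # two chunks that are actually different.
        if original is not None and store[original] == data:
            store[chunk_id] = ("ref", original)
            reclaimed += len(data)
        else:
            first_seen.setdefault(digest, chunk_id)
    return reclaimed


def read_chunk(store, chunk_id):
    """Return the chunk's bytes, following a pointer if one is stored."""
    entry = store[chunk_id]
    if isinstance(entry, tuple):
        return store[entry[1]]
    return entry


# Data is written first with no deduplication on the write path...
store = {0: b"alpha", 1: b"beta", 2: b"alpha", 3: b"alpha"}
# ...and the deduplication pass runs later as a separate step.
reclaimed = post_deduplicate(store)
print(reclaimed)             # 10
print(read_chunk(store, 3))  # b'alpha'
```

Because the pass only ever points later chunks at the first instance, reads need to follow at most one pointer. A real system would also have to handle concurrent writes and reference counting before deleting an original chunk, which this sketch ignores.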
The primary benefit of postdeduplication is that it does not introduce latency during the initial data write.
There are several approaches to implementing postdeduplication. Some systems perform a full scan of all data,