Overflatematching
Overflatematching is a term used in record linkage and data integration for a false-positive linkage error: treating records that refer to distinct entities as if they referred to the same entity. It occurs when similarity criteria are too broad, when the compared attributes lack sufficient distinguishing power, or when match thresholds are set too leniently. The result is an inflated set of apparent duplicates, or merged records that in fact describe different entities.
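As a rough illustration of how a lenient threshold produces this error, the sketch below scores a pair of records by summing per-attribute agreement weights and declares a match when the score reaches a threshold. The `Record` fields, the `WEIGHTS` values, and both thresholds are hypothetical and chosen only for the example; real systems typically estimate such weights from data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    gender: str
    birth_year: int
    postcode: str


# Hypothetical agreement weights: attributes that are common across the
# population (gender, approximate birth year) carry little evidence.
WEIGHTS = {"gender": 1.0, "birth_year": 2.0, "postcode": 3.0, "name": 4.0}


def agreement_score(a: Record, b: Record) -> float:
    """Sum the weights of the attributes on which the two records agree."""
    score = 0.0
    if a.gender == b.gender:
        score += WEIGHTS["gender"]
    if abs(a.birth_year - b.birth_year) <= 1:  # "approximate" birth year
        score += WEIGHTS["birth_year"]
    if a.postcode == b.postcode:
        score += WEIGHTS["postcode"]
    if a.name.lower() == b.name.lower():
        score += WEIGHTS["name"]
    return score


def is_match(a: Record, b: Record, threshold: float) -> bool:
    """Declare a match when the agreement score reaches the threshold."""
    return agreement_score(a, b) >= threshold


# Two different people who agree only on weak attributes.
alice = Record("Alice Moreno", "F", 1984, "90210")
alicia = Record("Alicia Morgan", "F", 1985, "10001")

score = agreement_score(alice, alicia)            # gender + birth_year
lenient = is_match(alice, alicia, threshold=3.0)  # weak evidence suffices
strict = is_match(alice, alicia, threshold=6.0)   # needs a strong attribute
print(score, lenient, strict)
```

With these illustrative numbers the pair scores 3.0 from gender and approximate birth year alone, so the lenient threshold merges two distinct people while the stricter one keeps them apart.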
Causes of overflatematching include relying on non-unique or common attributes (such as gender and approximate birth
Consequences include distorted counts of individuals or entities, biased analytics, degraded data quality, and incorrect histories
Mitigation strategies emphasize precision over recall. Approaches include probabilistic matching with calibrated thresholds, using multiple independent
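The precision-over-recall trade-off can be sketched with a small set of labeled candidate pairs. The scores and labels in `LABELED_PAIRS` are made up for illustration, and `lowest_threshold_meeting` is a hypothetical helper showing one simple way to pick a threshold from labeled data, not a standard calibration procedure.

```python
# (match score, is_true_match) for hypothetical labeled candidate pairs.
LABELED_PAIRS = [
    (9.5, True), (8.0, True), (7.0, True), (6.5, False),
    (5.0, True), (4.0, False), (3.0, False), (2.5, False),
]


def precision_recall(pairs, threshold):
    """Precision and recall when pairs scoring >= threshold are linked."""
    tp = sum(1 for s, m in pairs if s >= threshold and m)
    fp = sum(1 for s, m in pairs if s >= threshold and not m)
    fn = sum(1 for s, m in pairs if s < threshold and m)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def lowest_threshold_meeting(pairs, min_precision):
    """Smallest observed score that, used as threshold, meets min_precision."""
    for t in sorted({s for s, _ in pairs}):
        precision, _ = precision_recall(pairs, t)
        if precision >= min_precision:
            return t
    return None  # no observed score reaches the target precision


for t in (2.5, 5.0, 7.0):
    p, r = precision_recall(LABELED_PAIRS, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")

chosen = lowest_threshold_meeting(LABELED_PAIRS, min_precision=0.9)
print("chosen threshold:", chosen)
```

On this toy data, linking everything at or above 2.5 yields precision 0.50 with full recall, whereas requiring precision of at least 0.9 pushes the threshold to 7.0 and gives up one true match (recall 0.75). That is the cost of suppressing false merges.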
See also: record linkage, deduplication, probabilistic matching, data quality.