Dissimilarities
A dissimilarity is a quantitative measure of how different two objects are. Formally, a dissimilarity (sometimes loosely called a distance) is a function d that assigns a nonnegative real number d(x, y) to each pair of objects x and y from a given domain, with d(x, x) = 0 for every x. A dissimilarity is called a metric if it additionally satisfies symmetry, d(x, y) = d(y, x); the identity of indiscernibles, d(x, y) = 0 only when x = y; and the triangle inequality, d(x, z) ≤ d(x, y) + d(y, z), for all x, y, and z. Many dissimilarities used in practice fail one or more of these properties, and in data analysis the term distance is often applied loosely to any dissimilarity.
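The properties above can be checked by brute force on a small sample of objects. The following Python sketch is illustrative only: the function names (metric_violations, absolute_difference, squared_difference) and the sample points are assumptions made for this example, and passing the check on a finite sample does not prove that a function is a metric on the whole domain. The sketch also tests that distinct objects have positive distance, a condition normally included in the definition of a metric.

```python
from itertools import product


def metric_violations(d, points, tol=1e-12):
    """Return the set of metric properties that d violates on a finite sample."""
    violated = set()
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:
            violated.add("nonnegativity")
        if x == y and abs(d(x, y)) > tol:
            violated.add("d(x, x) = 0")
        if x != y and abs(d(x, y)) <= tol:
            violated.add("identity of indiscernibles")
        if abs(d(x, y) - d(y, x)) > tol:
            violated.add("symmetry")
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            violated.add("triangle inequality")
    return violated


def absolute_difference(x, y):
    """Absolute difference on the real line, a standard metric."""
    return abs(x - y)


def squared_difference(x, y):
    """Squared difference: a dissimilarity that is not a metric."""
    return (x - y) ** 2


points = [0.0, 1.0, 2.0]  # hypothetical sample
print(metric_violations(absolute_difference, points))
print(metric_violations(squared_difference, points))
```

On this sample the absolute difference passes every check, while the squared difference fails only the triangle inequality, since d(0, 2) = 4 exceeds d(0, 1) + d(1, 2) = 2.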
Different data types call for different measures. For binary or categorical data, the Hamming distance counts the positions at which two objects of equal length disagree.
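As a minimal sketch of the Hamming distance on equal-length sequences, the Python function below counts mismatching positions. The function name hamming_distance and the choice to raise an error on unequal lengths are assumptions of this example, not part of any particular library.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


# Binary vectors: the second and third positions differ.
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))

# Categorical data: each position holds a category label.
print(hamming_distance(["red", "small", "round"], ["red", "large", "round"]))
```

The same function works for strings, for example hamming_distance("karolin", "kathrin"), because it only relies on comparing items position by position.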
The properties a dissimilarity satisfies matter when selecting one. The choice influences the results of clustering, classification, and visualization.
Applications of dissimilarities include clustering, nearest-neighbor methods, anomaly detection, and information retrieval. Related concepts include similarity,