kMeans

K-means is a widely used unsupervised learning algorithm that partitions a dataset into k non-overlapping clusters. It seeks to minimize the within-cluster sum of squares, producing compact, roughly spherical groups when the data are suitable. The method operates on numeric feature vectors and is sensitive to feature scaling; standardization is typically recommended.
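
For instance, a minimal sketch of that standardization step, assuming scikit-learn is available and using a made-up two-feature matrix whose columns live on very different scales:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical data: two features with wildly different scales.
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0, 1, 200), rng.normal(0, 1000, 200)])

# Standardize to zero mean and unit variance so the second feature
# does not dominate the Euclidean distances k-means relies on.
X_scaled = StandardScaler().fit_transform(X)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
```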

The algorithm begins by choosing k initial centroids, often at random or with a smarter initialization such as k-means++. Each point is assigned to the nearest centroid using a distance measure (usually Euclidean). Centroids are then updated as the mean of the points assigned to their clusters. The process repeats until convergence, typically when assignments or centroids stabilize or a maximum number of iterations is reached.
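
A minimal NumPy sketch of this loop (Lloyd's algorithm), assuming random initialization rather than k-means++ and purely illustrative parameter names:

```python
import numpy as np

def kmeans(X, k, n_iters=100, tol=1e-6, seed=0):
    """Plain NumPy sketch of Lloyd's algorithm with random initialization."""
    rng = np.random.default_rng(seed)
    # Step 1: choose k initial centroids by sampling distinct points at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: update each centroid as the mean of its assigned points
        # (keep the old centroid if a cluster ends up empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop when the centroids stabilize or the iteration budget runs out.
        if np.linalg.norm(new_centroids - centroids) < tol:
            centroids = new_centroids
            break
        centroids = new_centroids
    return centroids, labels
```

In practice one would usually reach for scikit-learn's KMeans instead, which implements the same loop with k-means++ initialization and multiple restarts via n_init.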

In practice, the complexity is O(n k d) per iteration for n points in d dimensions, giving O(n k d t) over t iterations. K-means is efficient for large data sets but has limitations: it requires specifying k in advance, is sensitive to initialization, feature scaling, and outliers, and assumes roughly convex, equally sized clusters.

Variants include k-means++ initialization, mini-batch k-means for large data, and alternatives such as k-medoids or kernel k-means.
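
As a rough illustration of two of these variants, assuming scikit-learn (whose KMeans defaults to k-means++ initialization and which provides MiniBatchKMeans) and synthetic placeholder data:

```python
import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 8))  # placeholder data for illustration only

# k-means++ initialization spreads the initial centroids out,
# which usually reduces the chance of a bad local optimum.
km = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=0).fit(X)

# Mini-batch k-means updates centroids from small random batches,
# trading a little accuracy for much lower cost on large data sets.
mbkm = MiniBatchKMeans(n_clusters=5, batch_size=256, n_init=10, random_state=0).fit(X)
```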

Evaluation commonly uses inertia (the within-cluster sum of squares) and metrics like the silhouette score or the Davies-Bouldin index.
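
A short sketch of such an evaluation loop, again assuming scikit-learn and a synthetic blob dataset, comparing inertia, silhouette, and Davies-Bouldin scores across candidate values of k:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score, davies_bouldin_score

# Synthetic data with four well-separated blobs, for illustration.
X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(
        f"k={k}  inertia={km.inertia_:.1f}  "
        f"silhouette={silhouette_score(X, km.labels_):.3f}  "
        f"davies_bouldin={davies_bouldin_score(X, km.labels_):.3f}"
    )
```

Inertia always decreases as k grows, so it is typically read via an elbow plot, while silhouette (higher is better) and Davies-Bouldin (lower is better) can be compared directly across values of k.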