Silhouette score

Silhouette score is a metric used to evaluate the quality of a clustering result. For a given labeling of data into clusters, it assigns each sample a silhouette coefficient between -1 and 1. The coefficient reflects how similar the sample is to its own cluster compared with points in the nearest neighboring cluster. A higher score indicates better clustering, while negative values suggest possible misassignment.

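For a sample i in cluster A, a(i) is the average distance from i to all other points in A, and b(i) is the smallest average distance from i to the points of any other cluster. The silhouette coefficient of i is s(i) = (b(i) - a(i)) / max(a(i), b(i)), with s(i) conventionally set to 0 when A contains only one point.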
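The per-sample computation can be sketched in plain Python. This is a minimal O(n^2) reference version, assuming the standard definition in which b(i) is the smallest mean distance from i to the points of any other cluster and s(i) = (b(i) - a(i)) / max(a(i), b(i)). The function name silhouette_naive, the dist parameter, and the toy data are illustrative and do not come from any particular library.

```python
import math
from collections import defaultdict


def silhouette_naive(points, labels, dist=math.dist):
    """Return one silhouette coefficient per sample (illustrative sketch)."""
    # Group sample indices by cluster label.
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a singleton cluster gets a coefficient of 0.
            scores.append(0.0)
            continue
        # a(i): mean distance to the other members of the sample's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Toy data: two tight pairs of points placed five units apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]
scores = silhouette_naive(points, labels)
mean_score = sum(scores) / len(scores)
```

On this symmetric toy data every sample has a(i) = 1 and b(i) of roughly 5.05, so each coefficient is about 0.80. Passing a different callable as dist swaps the distance metric without changing the rest of the computation.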

Computation typically uses a distance metric, most commonly Euclidean distance, but any metric supported by the implementation can be used.

Limitations include sensitivity to density variations and to clusters of different sizes, and it may be less informative for non-convex clusters.

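A coefficient near 1 indicates that the sample lies well inside its own cluster, a coefficient near 0 indicates that it sits on the boundary between two clusters, and negative coefficients point to possible misclassifications. The mean coefficient over all samples is often compared across candidate numbers of clusters k, with a higher mean suggesting a more suitable k.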

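Implementations include scikit-learn, which provides silhouette_score for the mean coefficient over all samples and silhouette_samples for the per-sample coefficients.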
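The two function names match those in scikit-learn's sklearn.metrics module. Assuming that is the implementation meant, a short usage sketch might look like the following. The toy array X and the hand-written labels are made up for illustration, and the metric="manhattan" call simply shows that the distance metric is a parameter.

```python
import numpy as np
from sklearn.metrics import silhouette_samples, silhouette_score

# Illustrative data: two well-separated groups of three points each.
X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [8.0, 8.0], [8.0, 9.0], [9.0, 8.0],
])
labels = np.array([0, 0, 0, 1, 1, 1])

# Mean silhouette coefficient over all samples (Euclidean by default).
mean_score = silhouette_score(X, labels)

# One coefficient per sample, useful for spotting poorly placed points.
per_sample = silhouette_samples(X, labels)

# The same score under a different distance metric.
manhattan_score = silhouette_score(X, labels, metric="manhattan")
```

The mean of per_sample equals mean_score, so silhouette_samples is the one to reach for when individual samples need inspecting rather than a single summary number.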