BCubed
BCubed (also written B-Cubed or B³) is a family of evaluation metrics for clustering and classification, with extended variants designed to handle overlapping clusters. It was introduced by Amit Bagga and Breck Baldwin in 1998 to evaluate entity-based cross-document coreference resolution. BCubed defines precision and recall at the item level and then averages these scores across all items to produce overall precision and recall, which are typically combined into a single F1 score. A defining feature of the extended variants is that they accommodate items that belong to multiple clusters or multiple true categories.
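The item-level averaging described above can be illustrated with a minimal Python sketch. It is not a reference implementation. It assumes the multiplicity-based formulation commonly used for overlapping clusters, in which a pair of items scores min(shared clusters, shared categories) divided by the number of groups shared on the side being evaluated. With non-overlapping data this reduces to the simple "fraction of my cluster that shares my category" definition. The names bcubed, _averaged_score, clusters, and categories are hypothetical and chosen for illustration only.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Maps each item to the set of group ids (clusters or gold categories) it belongs to.
Labels = Dict[Hashable, Set[Hashable]]


def _averaged_score(items: List[Hashable], side: Labels, other: Labels) -> float:
    """Average, over items, of the mean pairwise score on `side`.

    With side=clusters and other=categories this is precision;
    with the arguments swapped it is recall.
    """
    total = 0.0
    for i in items:
        pair_scores = []
        for j in items:  # j == i is included whenever item i has at least one group
            shared_side = len(side[i] & side[j])
            if shared_side == 0:
                continue  # the pair shares nothing on this side, so it is not scored
            shared_other = len(other[i] & other[j])
            pair_scores.append(min(shared_side, shared_other) / shared_side)
        # Assumption: an item with no group on this side contributes 0.
        if pair_scores:
            total += sum(pair_scores) / len(pair_scores)
    return total / len(items)


def bcubed(clusters: Labels, categories: Labels) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), each averaged over all items."""
    if set(clusters) != set(categories):
        raise ValueError("clusters and categories must cover the same items")
    items = list(clusters)
    if not items:
        return 0.0, 0.0, 0.0
    precision = _averaged_score(items, clusters, categories)
    recall = _averaged_score(items, categories, clusters)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: gold categories X = {a, b, c}, Y = {d, e};
# predicted clusters 1 = {a, b}, 2 = {c, d, e}.
example_categories: Labels = {
    "a": {"X"}, "b": {"X"}, "c": {"X"}, "d": {"Y"}, "e": {"Y"},
}
example_clusters: Labels = {
    "a": {1}, "b": {1}, "c": {2}, "d": {2}, "e": {2},
}

if __name__ == "__main__":
    p, r, f = bcubed(example_clusters, example_categories)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In the toy example, the per-item precisions are 1, 1, 1/3, 2/3, and 2/3, and the per-item recalls are 2/3, 2/3, 1/3, 1, and 1, so precision, recall, and F1 all equal 11/15 (about 0.733). The sketch runs in quadratic time in the number of items, which is acceptable for illustration but may be too slow for large collections.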
To compute BCubed precision, for each item i, consider all predicted clusters that contain i. For each
BCubed is widely used in information retrieval and natural language processing to evaluate document clustering, named-entity