Testsum
Testsum is a scalar metric that summarizes the outcomes and significance of a set of software tests. It aggregates per-test results into a single number that can be tracked over time, reported on dashboards, or used in release decisions. It is intended for teams that want a compact indicator reflecting both test outcomes and the relative importance of each test.
Computation is defined for a suite with tests i = 1 to n. Each test has an outcome
Variants and extensions include time-based decay, which gives more weight to recent tests, and alternative scoring
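As a rough illustration of time-based decay, the sketch below down-weights older test runs with an exponential half-life. The function name `decayed_testsum`, the `(passed, weight, age_days)` tuple layout, the half-life parameter, and the choice of exponential decay are assumptions for this example; other decay shapes would fit the description equally well.

```python
def decayed_testsum(runs, half_life_days=7.0):
    """Sum weights of passing runs, halving each contribution every half-life.

    runs: iterable of (passed, weight, age_days) tuples (hypothetical layout).
    """
    total = 0.0
    for passed, weight, age_days in runs:
        # A run that is `half_life_days` old counts half as much as a fresh one.
        decay = 0.5 ** (age_days / half_life_days)
        if passed:
            total += weight * decay
    return total


runs = [
    (True, 2.0, 0.0),   # fresh pass: contributes 2.0
    (True, 2.0, 7.0),   # one half-life old: contributes 1.0
    (False, 5.0, 0.0),  # failure: contributes 0
]
print(decayed_testsum(runs))  # 3.0
```

A shorter half-life makes the score react faster to recent results, at the cost of discarding history sooner.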
Interpretation of results: A higher Testsum indicates greater passing coverage of important tests; a lower value
Limitations: The metric depends heavily on the chosen weights and scoring rules, and may mask flaky tests