TimeoutRate
TimeoutRate is a metric used in reliability engineering to quantify the fraction of operations that do not complete within a predefined timeout. It is applicable to distributed systems, networks, databases, message queues, and client applications, where timeouts are a common failure mode.
Definition and calculation: TimeoutRate is typically defined as the number of timeouts divided by the total
Measurement considerations: TimeoutRate can be measured per endpoint, service, or client type. Timeouts may originate from
Uses and interpretation: TimeoutRate helps detect latency or capacity issues, monitor SLA compliance, and inform configuration
Limitations: Timeouts are influenced by both client and server behavior; the metric alone does not diagnose
Related metrics: latency (e.g., p95), success rate, error rate, and overall availability.