Fehlerbudget
Fehlerbudget, or error budget, is a concept used in software engineering and system reliability to manage and allocate acceptable levels of system failures. It is a predefined limit of downtime or errors that a system can tolerate within a given period, often expressed as a percentage of uptime. The idea is to balance the need for system reliability with the desire for innovation and feature development. By setting an error budget, teams can make informed decisions about risk and prioritize work accordingly.
The error budget concept was popularized by Google's Site Reliability Engineering (SRE) team. In their approach,
Using an error budget, teams can make decisions about when and how to deploy changes, test new
Error budgets can also be used to prioritize work and allocate resources more effectively. By understanding
In summary, the error budget is a powerful tool for managing system reliability and balancing the need