Hi
My name is Rachel and I am the Engineering Manager for the Scalability Team in the Infrastructure Department.
I'd like to talk you through the recent addition of Error Budgets. [ Link ]
---
Part of our goal in the Scalability Team is to enable Development teams to understand how their feature categories perform on GitLab.com. To do this, we introduced the stage group dashboards, which are covered in a separate video.
These provide a lot of information about your feature categories. And these dashboards show how reliable and performant your code is.
But it can be hard to know how much time a team should spend on reliability and performance. And how to balance this with feature work.
So we have introduced Error Budgets as a way to provide data to help with these prioritization decisions.
---
Error budgets provide a single number, expressed as a percentage, to show how a group of feature categories is performing on GitLab.com.
You can see your stage group's Error Budget through your stage group dashboard. For example, this link shows the Source Code Error Budget: [ Link ]
Error budgets are made up of two parts: Application Performance Index (or Apdex) and Error Rate. [ Link ]
For Apdex, we count how many requests were executed successfully and how many completed in a time that is acceptable for that specific endpoint.
For Error Rate, we count how many requests were received and how many completed without error.
Every request is attributed to a stage group, and we sum up these totals to produce an Error Budget. The formula is shown in this handbook page: [ Link ]
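To make the arithmetic concrete, here is a minimal sketch of one way the two parts could be combined into a single percentage. The formula itself lives in the handbook page and is not reproduced in this video, so the combination below (good operations divided by total operations across both Apdex and Error Rate) is an assumption, and the names `GroupTotals` and `error_budget_availability` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GroupTotals:
    """Request counts for one stage group (hypothetical structure)."""
    apdex_total: int    # requests measured for Apdex
    apdex_success: int  # of those, completed within the acceptable time
    error_total: int    # requests received
    error_count: int    # of those, completed with an error


def error_budget_availability(t: GroupTotals) -> float:
    """Return the share of good operations as a fraction between 0 and 1.

    Assumption: Apdex and Error Rate totals are simply summed, so each
    request can contribute to both parts.
    """
    good = t.apdex_success + (t.error_total - t.error_count)
    total = t.apdex_total + t.error_total
    if total == 0:
        raise ValueError("no requests recorded for this stage group")
    return good / total


# Made-up numbers for illustration only.
example = GroupTotals(
    apdex_total=1_000_000,
    apdex_success=999_600,
    error_total=1_000_000,
    error_count=200,
)
print(f"{error_budget_availability(example):.2%}")  # prints 99.97%
```

With these made-up counts, 400 slow requests and 200 failed requests out of two million measured operations give 99.97%.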
An Error Budget looks back over the previous 28 days to show how the feature categories performed. The choice of 28 days was deliberate: it was the best way to capture trends and account for weekly fluctuations, and it also aligns with the Product Development Timeline. [ Link ] The recommendation is to review the stage group Error Budget 14 days before a milestone begins to determine how to balance feature work with reliability work. Budget spend is reported monthly on the 4th. [ Link ]
The target is set at 99.95%, which matches the targets the Infrastructure department has. A shared target means that we have a consistent language and consistent goals. These targets also ensure that the infrastructure we rely on is able to meet the needs of the application. Every team below this target places a strain on the infrastructure resources that we share.
If your stage group drops below the target, you should investigate the budget failures shown on your dashboard.
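As a rough illustration of what a 99.95% target means over a 28-day window, the sketch below converts the percentages into minutes. Expressing the budget as time is an assumption made here for intuition (the budget described above is counted in requests), and the function names are hypothetical.

```python
WINDOW_DAYS = 28   # look-back window described above
TARGET = 0.9995    # 99.95% target described above


def window_minutes(window_days: int = WINDOW_DAYS) -> int:
    """Total minutes in the look-back window."""
    return window_days * 24 * 60


def allowed_minutes(target: float = TARGET, window_days: int = WINDOW_DAYS) -> float:
    """Time-equivalent of the budget: the part of the window that may fail."""
    return (1 - target) * window_minutes(window_days)


def spent_minutes(availability: float, window_days: int = WINDOW_DAYS) -> float:
    """Time-equivalent of the budget already spent at a given availability."""
    return (1 - availability) * window_minutes(window_days)


def within_target(availability: float, target: float = TARGET) -> bool:
    """True when the stage group meets or exceeds the target."""
    return availability >= target


print(round(allowed_minutes(), 2))      # about 20.16 minutes allowed
print(round(spent_minutes(0.9997), 2))  # about 12.1 minutes spent
print(within_target(0.9993))            # False: investigate the failures
```

Under these assumptions, a group at 99.97% has spent roughly 12 of its 20 minutes, while a group at 99.93% is over budget and should look at the budget failures on its dashboard.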
---
We will continue to iterate on how Error Budgets work and how you can control the metrics recorded for each endpoint. We recognize that different features have different needs, and we are working to better cater for this diversity.
---
We hope you find it helpful to see how your code performs on GitLab.com. And we hope this system provides data to support you in your team's prioritization activities.
Thank you for watching.