David Mah, Dropbox
At Dropbox, we’ve worked hard to build infrastructure that we can trust. A major part of that confidence comes from verifying our data at rest, which gives us a signal that the data will be usable when requests actually come in.
In this talk, we’ll break down how to design and build a consistency checker system. We’ll start with the needs and goals of such a system, then walk through its sub-components. We’ll cover both the distributed system design and how to design an alert escalation workflow that is as simple as possible for human operators.
Attendees are expected to leave the session understanding how they could build consistency checkers for their own systems. This includes:
- Do you even need a consistency checker?
- What independent components need to exist?
- What is a good alerting + triaging workflow?
- What is involved in an auto-remediation mechanism for constraint failures?
Sign up to find out more about SREcon at [ link ]