Tag Archives: MTTR

Pressure Release Valves

This is the fourth in a series of posts on increasing overall availability of your service or system. Have you ever gotten paged, and known right away that this problem isn’t like the last 15 operations issues you’ve dealt with …

Posted in Availability

A Standard Operating Procedure for when s*IT hits the fan

This is the third in a series of posts on increasing overall availability of your service or system. In the first post of this series, we defined and introduced some concepts of system availability, including mean time between failure – MTBF …

Posted in Availability

Availability lessons from shoe companies and ancient warlords

This is the second in a series of posts on increasing overall availability of your service or system. In the first post of this series, we defined and introduced some concepts of system availability, including mean time between failure – …

Posted in Availability | 1 Comment

The ups and downs of Availability

This post is meant as a quick introduction to some concepts of system availability, so that subsequent posts in this series make sense. I’ll go over concepts like availability, SLA, mean time between failure, mean time to recovery, etc.

Posted in Availability | 5 Comments