Tag Archives: Availability

Not breaking your Google Analytics (like a pro)

As a general rule, whatever percentage you think your test coverage is, it isn’t. Whatever amount of the known surface area you’re covering, there’s going to be an exciting swath of things you didn’t realize that you need to test. … Continue reading

Posted in Best Practices

Pressure Release Valves

This is the fourth in a series of posts on increasing overall availability of your service or system. Have you ever gotten paged, and known right away that this problem isn’t like the last 15 operations issues you’ve dealt with … Continue reading

Posted in Availability

A Standard Operating Procedure for when s*IT hits the fan

This is the third in a series of posts on increasing overall availability of your service or system. In the first post of this series, we defined and introduced some concepts of system availability, including mean time between failure – MTBF … Continue reading

Posted in Availability

Availability lessons from shoe companies and ancient warlords

This is the second in a series of posts on increasing overall availability of your service or system. In the first post of this series, we defined and introduced some concepts of system availability, including mean time between failure – … Continue reading

Posted in Availability | 1 Comment

The ups and downs of Availability

This post is meant as a quick introduction to some concepts of system availability, so that subsequent posts in this series make sense. I’ll go over concepts like availability, SLA, mean time between failure, mean time to recovery, etc. Continue reading

Posted in Availability | 5 Comments