PagerDuty Blog

Expanding PagerDuty with $10.7M in new funding

I’m very happy to announce we’ve just received $10.7M in funding, led by Andreessen Horowitz.  Also participating in the round were Jesse Robbins, founder of Opscode; WIN Funding; and existing investors Baseline, Harrison Metal and Ignition.

We will be using the funding to accelerate product and market development of our IT incident tracking and on-call management platform.  In other words, more money means we’ll hire more crazy-smart engineers to write more features, further raise the bar on our system’s reliability, and ultimately make even more customers happy.

I’ll be perfectly honest: Up to this point, we’ve been flying a bit under the radar.  We haven’t talked very much about our traction and success thus far.  We also haven’t done much in the way of publicity and self-promotion.  Instead, we’ve relied on our customers to spread PagerDuty via word of mouth.  This strategy has actually worked out quite well (as it turns out, when you build a good product that solves a real, hair-on-fire problem, people will pay for it).  Today, we have thousands of customers ranging from large enterprise companies (Microsoft, Adobe, Intuit, EA) to startups (Square, GitHub, Pinterest, Etsy) and everything in between.  We’ve come a long way, but we still have a long way to go.

The main reason we’ve raised this new funding round is to realize our vision for the product much faster.  At this point, I’m sure you’re wondering “What is the PagerDuty vision?”  Today, we are the “9-1-1 dispatch” system for IT.  It’s a bit like normal “9-1-1”, which is used to dispatch emergency services such as police and ambulance.  Our system dispatches engineers to fix critical issues in your IT infrastructure.

The next major step in the vision is to expand beyond just the critical incidents.  We ultimately will become the central nervous system of IT: we’ll provide the interconnecting fabric between your systems and the people responsible for managing them.  We will continue to focus on solving the people part of IT incident management and leave monitoring for the monitoring guys.  Our big audacious goal is to reduce the noise.  If you think about it, DevOps teams use multiple monitoring tools, each of which produces a lot of alerts.  Most of these alerts are not critical or high priority.  We want to develop PagerDuty to slurp in all of these monitoring events and increase the signal-to-noise ratio for our users.  In other words, only the critical issues should wake you up at 4am, false alerts should be automatically filtered out, and low-priority incidents should be surfaced in aggregate in summary reports.  Of course, there’s a lot more to this, but we can’t give everything away just yet :).
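To make the idea a bit more concrete, here is a minimal sketch in Python of that kind of triage.  It is only an illustration, not how PagerDuty is or will be implemented: the `Event` fields, the severity labels, the `is_false_alert` flag (assumed to be decided somewhere upstream), and the `triage` function are all hypothetical names made up for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One monitoring event. All fields are hypothetical, for illustration."""
    source: str                   # name of the monitoring tool that sent it
    summary: str                  # short description of the problem
    severity: str                 # assumed labels: "critical" or "low"
    is_false_alert: bool = False  # assumed to be determined upstream


def triage(events):
    """Split events into pages (wake someone up) and a digest (summary counts)."""
    pages = []
    digest = Counter()
    for event in events:
        if event.is_false_alert:
            continue  # false alerts are dropped entirely
        if event.severity == "critical":
            pages.append(event)  # only critical issues page a human
        else:
            # everything else is counted for an aggregate summary report
            digest[(event.source, event.summary)] += 1
    return pages, digest


events = [
    Event("monitor-a", "db-1 unreachable", "critical"),
    Event("monitor-a", "disk 70% full on web-3", "low"),
    Event("monitor-a", "disk 70% full on web-3", "low"),
    Event("monitor-b", "checkout latency spike", "critical", is_false_alert=True),
]

pages, digest = triage(events)
for event in pages:
    print("PAGE:", event.source, "-", event.summary)
for (source, summary), count in digest.items():
    print("DIGEST:", source, "-", summary, "x", count)
```

In this toy run, only the unreachable database pages anyone, the false alert disappears, and the two identical disk warnings collapse into a single digest line.  The hard part in practice is deciding what counts as critical or false in the first place, which this sketch simply assumes is already known.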

What makes the vision really exciting is that we’re solving an important problem, incident response, that’s never really been solved very well before.  We’re replacing cobbled-together solutions and manual processes, and ultimately helping DevOps engineers resolve issues faster and reduce downtime.  We want to help our users become heroes at their company (and in front of their boss).

From a product perspective, we are really focused on building a system that’s intuitive to use, that doesn’t have a steep learning curve (like many other enterprise software systems and IT tools), and that really meshes with the mantra of “make easy things easy and hard things possible”.

Finally, what really excites us from an engineering perspective is building an extremely reliable, fault-tolerant, distributed system at scale.  Indeed, we have an absolute obsession with reliability.  We believe that even two minutes of downtime is completely unacceptable, and that planned downtime and maintenance windows are no longer acceptable in today’s 24×7 world.  We’ve found that failover architectures aren’t suitable for extreme uptime applications like PagerDuty, and have therefore started converting our message dispatch pipeline to use fully distributed data stores like Cassandra and ZooKeeper.  Our ultimate goal is to be able to survive the total loss of a data center without any interruption or delay whatsoever to alert deliveries.

If this sounds exciting to you and you want to join us on this path, we are looking for smart reliability engineers, DevOps engineers, and front-end JavaScript experts to help us reinvent the DevOps tools space.