Cut Through the Chaos With PagerDuty Event Intelligence

by David Shackelford June 7, 2018 | 4 min read

Across every industry and operational model we serve, customers tell us they’re struggling to find the actionable signal in a sea of data. The systems and services that teams run are growing more complex every year, and headcount never scales at the same rate.

This means the volume of telemetry organizations must deal with is no longer manageable using existing methods—many enterprises deal with thousands, sometimes even millions, of events per day. Several organizations have told us that when a major incident hits, their responders have to turn off their phones to prevent an alert storm from blowing them up with duplicate notifications. This is annoying and distracting, especially when the stakes are extremely high—but worse, it also makes it impossible to quickly identify the actual issue. And for the business, it translates into lost resolution time and additional risk.

From Event Management to Event Intelligence

Reducing noise has always been part of PagerDuty’s mission, and our platform has accomplished this through automating on-call scheduling and escalations, supporting effective collaboration and incident response, and providing reporting and insights—all in a way that empowers teams to own their own destinies.

But now, we’re going further, with a new product that gives your team superpowers to handle the growing torrent of signals from all your tools and infrastructure.

Event Intelligence tackles many of the universal problems in the event management world, including collecting signals from all your tools, suppressing noise, correlating actionable alerts, and getting that information to responders. But it does so in a new way: by fusing system and human data to dial down the noise, focus your response, and empower your team.

Intelligent Alert Grouping was born from a simple insight: There’s a lot you can do with rich data from your systems—but just as important (maybe more) is what responders do with that data. Infrastructure scales and changes, teams spin up new services that interact in unpredictable ways, and traditional command-and-control approaches just can’t keep up.

But by looking at how users on a team interact with their operational issues and learning from that behavior over time, we can effectively correlate alerts and cut through the noise even as the system grows and changes, saving customers immense time and money and enabling their responders to focus on higher-level, more impactful work.

Once your alerts are correlated down to an actionable incident, it’s time to respond. Similar Incidents looks back through an account’s response history for incidents related to the current one, using data science to put exactly the right context at responders’ fingertips. Responders can easily tell whether an incident is a routine blip or a potentially dangerous anomaly, and view notes and other metadata from past incidents to help with triage. By seeing patterns in operational issues that only show up in aggregate, responders are more confident and more effective, and they save precious time when it matters most.

“Similar incidents is like having an extra responder on the team.” –Corey Burke, Dialpad

Behind the scenes, Advanced Event Automation filters, enriches, and prioritizes your signals, ensuring nothing unnecessarily notifies a human—and that the signals that do come through include all the right context, such as runbooks and remediation information.
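
To make the filter, enrich, and prioritize steps concrete, here is a minimal Python sketch of that kind of pipeline. It is purely illustrative and is not how Advanced Event Automation is implemented or configured: the `Event` class, the `process` function, the `RUNBOOKS` table, the severity names, and the example.com URL are all assumptions made up for this example.

```python
from dataclasses import dataclass, field

# Hypothetical runbook lookup keyed by service name (illustrative only).
RUNBOOKS = {"checkout": "https://example.com/runbooks/checkout"}

# Assumed severity ordering: a lower rank means more urgent.
SEVERITY_RANK = {"critical": 0, "error": 1, "warning": 2, "info": 3}


@dataclass
class Event:
    service: str
    severity: str
    summary: str
    context: dict = field(default_factory=dict)


def process(events):
    """Filter, enrich, and prioritize events, in that order."""
    # Filter: drop informational events so they never notify a human.
    actionable = [e for e in events if e.severity != "info"]
    # Enrich: attach a runbook link when one is known for the service.
    for e in actionable:
        if e.service in RUNBOOKS:
            e.context["runbook"] = RUNBOOKS[e.service]
    # Prioritize: most severe first; unknown severities sort last.
    return sorted(
        actionable,
        key=lambda e: SEVERITY_RANK.get(e.severity, len(SEVERITY_RANK)),
    )


events = [
    Event("search", "info", "Cache warm-up finished"),
    Event("checkout", "warning", "Latency above threshold"),
    Event("checkout", "critical", "Payment API returning 500s"),
]
for e in process(events):
    print(e.severity, e.summary, e.context.get("runbook", "no runbook"))
```

In this toy version the informational event is dropped, the two checkout events pick up a runbook link, and the critical event is listed ahead of the warning. A real rule set would be far richer, but the order of operations is the point: suppress first, add context second, rank last.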

We previewed many of these capabilities at last year’s PagerDuty Summit and have received great feedback from hundreds of early-access customers. They’ve told us that Event Intelligence has replaced manual triage processes, improved their responders’ quality of life, and saved them countless hours of configuration and maintenance. And looking across our customers using these capabilities, we’ve seen overall noise reduction of 98 percent as signals are filtered, suppressed, and intelligently correlated.

Try It Today

Now we’re excited to bring Event Intelligence to all of our customers. To get started, reach out to your PagerDuty representative today or sign up for a free trial.