
Ready for Anything with the PagerDuty Operations Cloud

by Dormain Drewitz June 7, 2022 | 8 min read

In a world of digital everything, teams face increasing complexity. Ever-growing dependencies across systems and processes put customer and employee experience, not to mention revenue, at risk. There is simply too much data for humans to sift through and correlate in order to understand what is important and know when something is going wrong.

To be ready for anything amid this growing complexity and these dependencies, operations must transform from a manual, rigid, ticket-queue-based model into a continuously improving system: one that keeps the focus on customer experience, delivers both operational speed and resilience, and is heavily automated and augmented by machine learning and AI. Only then can teams move toward a more proactive posture that reduces manual toil, helps avoid burnout, and preserves focus.

PagerDuty’s mission is to revolutionize operations so that teams spend less time on reactive, break-fix work, and more time delivering new innovation while meeting their desired resiliency objectives. We see this future of operations extending beyond the digital teams that build and run software, to all teams in the organization. 

At Summit 2022 we are announcing several updates to the PagerDuty Operations Cloud that move teams towards this vision. 

Automate Everywhere

PagerDuty helps customers automate their operations. With PagerDuty, organizations can accelerate responses to urgent IT incidents, customer issues, and even non-urgent day-to-day operations. 

Today, at PagerDuty Summit, PagerDuty is announcing new automation capabilities: custom incident response workflows, ways to focus responders on the issues that matter, more ways to reduce MTTR through automated diagnostics and remediations, and more power for your customer service agents to proactively support your customers.

Incident Workflows: First, we’re excited to share that PagerDuty is integrating the powerful Catalytic workflow engine to support customizable Incident Workflows. With Incident Workflows, you define the workflow logic to configure a sequence of common incident actions—such as adding a responder, subscribing stakeholders, or starting a conference bridge—into an orchestrated response. These workflows run automatically, triggered by common events like a change in priority. Learn more in this summary of what’s new in PagerDuty Incident Response.
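To make the idea concrete, here is a minimal sketch of a workflow as "one trigger plus an ordered list of actions." It is a toy model only: the class, the function names, the event name, and the bridge URL are all hypothetical and are not PagerDuty's API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical toy model; none of these names come from PagerDuty.
Incident = Dict[str, Any]
Action = Callable[[Incident], None]


def add_responder(name: str) -> Action:
    def action(incident: Incident) -> None:
        incident.setdefault("responders", []).append(name)
    return action


def subscribe_stakeholders(*names: str) -> Action:
    def action(incident: Incident) -> None:
        incident.setdefault("subscribers", []).extend(names)
    return action


def start_conference_bridge(url: str) -> Action:
    def action(incident: Incident) -> None:
        incident["bridge"] = url
    return action


@dataclass
class IncidentWorkflow:
    trigger: str                              # event that starts the workflow
    actions: List[Action] = field(default_factory=list)

    def handle(self, event: str, incident: Incident) -> bool:
        """Run every action in order if the event matches the trigger."""
        if event != self.trigger:
            return False
        for action in self.actions:
            action(incident)
        return True


workflow = IncidentWorkflow(
    trigger="priority_changed",
    actions=[
        add_responder("db-oncall"),
        subscribe_stakeholders("vp-eng", "support-lead"),
        start_conference_bridge("https://example.com/bridge/123"),
    ],
)

incident = {"id": "INC-1", "priority": "P1"}
ran = workflow.handle("priority_changed", incident)
```

After the call, the incident dictionary carries the responder, the subscribers, and the bridge link, and any event other than the trigger leaves it untouched.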

Auto-Pause Incident Notifications: Responders can also use automation to avoid unnecessary disruption. With PagerDuty Event Intelligence, responders can suppress transient noise with Auto-Pause Incident Notifications. This feature applies machine learning to automatically detect and pause transient alerts that have historically resolved on their own. For customers who experience these types of alerts, this can make a big difference in helping teams stay more productive, or better yet, stay asleep when they don’t need to be woken up by something that will self-heal. In just the first three months after release, Auto-Pause Incident Notifications paused more than 350,000 flapping alerts.

Learn more about Auto-Pause Incident Notifications on our website.
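For intuition only, the decision behind pausing a transient alert can be approximated with a fixed-threshold rule over an alert's history. This is a deliberately simple stand-in, not the machine learning model the feature uses; the function name, the five-minute window, the minimum sample count, and the 80% threshold are all assumptions made for illustration.

```python
from typing import List, Optional


def should_pause(
    history: List[Optional[float]],
    pause_window_s: float = 300.0,   # assumed pause window (5 minutes)
    min_samples: int = 5,            # assumed minimum history before deciding
    threshold: float = 0.8,          # assumed share of transient occurrences
) -> bool:
    """Decide whether to hold back the notification for a recurring alert.

    `history` holds one entry per past occurrence of the same alert: the
    number of seconds until it resolved on its own, or None if a human
    had to step in.
    """
    if len(history) < min_samples:
        return False  # not enough evidence, notify as usual
    transient = sum(
        1 for seconds in history
        if seconds is not None and seconds <= pause_window_s
    )
    return transient / len(history) >= threshold


flapping = [30, 45, None, 20, 60, 15, 40, 25, 50, 35]   # 9 of 10 self-resolved
real_outage = [None, None, 900, None, 20]               # 1 of 5 within window
```

Here `should_pause(flapping)` is true and `should_pause(real_outage)` is false, so only the flapping alert would be held back while the system waits to see whether it clears.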

Automation Actions… everywhere: We’ve heard from our customers that they want to empower their first responders to diagnose and remediate common issues through automation. Last year, we introduced PagerDuty Automation Actions to securely invoke automation directly from PagerDuty. Now we’re making Automation Actions available through the PagerDuty mobile app, and within Slack. Responders can now resolve issues faster from wherever they respond, and it’s easier to collaborate with incident teams through shared results of this automation.  

But why wait for a responder to run common diagnostics? Event Orchestration users can now trigger these same automated diagnostics proactively, even before first responders acknowledge an incident, giving them the information they need to act faster. These same triggers can also attempt self-healing for well-known issues, paging humans only if the most likely automated fix fails to resolve the issue.
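The control flow described above (diagnose first, try the most likely fix, page a human only if the fix fails) can be sketched in a few lines. Everything here is hypothetical: `handle_event`, its parameters, and the fake disk example are illustrative and do not reflect how Event Orchestration is configured.

```python
from typing import Any, Callable, Dict, Optional

Event = Dict[str, Any]


def handle_event(
    event: Event,
    diagnostics: Dict[str, Callable[[Event], str]],
    remediation: Optional[Callable[[Event], None]],
    is_healthy: Callable[[Event], bool],
    page: Callable[[Event, Dict[str, str]], None],
) -> Dict[str, Any]:
    # 1. Run diagnostics up front so results are ready before anyone acks.
    context = {name: run(event) for name, run in diagnostics.items()}

    # 2. Attempt the most likely automated fix, if one is defined.
    if remediation is not None:
        remediation(event)
        if is_healthy(event):
            return {"paged": False, "diagnostics": context}

    # 3. The fix failed or none exists: page a human, with diagnostics attached.
    page(event, context)
    return {"paged": True, "diagnostics": context}


# Fake environment for the example.
state = {"disk_full": True}
pages = []

result = handle_event(
    {"service": "checkout", "summary": "disk usage high"},
    diagnostics={"disk": lambda e: "97% used"},
    remediation=lambda e: state.update(disk_full=False),
    is_healthy=lambda e: not state["disk_full"],
    page=lambda e, ctx: pages.append((e["service"], ctx)),
)
```

In this run the remediation clears the fake disk condition, so nobody is paged; if the health check still failed afterward, the responder would be paged with the diagnostic output already attached.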

Finally, we’ve extended this ability to provide automation to your customer service teams. Your customer service agents can now invoke automated validation tests from PagerDuty Customer Service Ops to determine if a customer’s issue is related to a system problem, and proactively receive information about known problems that could potentially be affecting their customer’s experience. 

Read more in our update blog post on Automation Actions

PagerDuty Runbook Automation: PagerDuty is making it easier for our customers to orchestrate automated diagnostics, remediations, and day-to-day operations through our recently launched SaaS offering PagerDuty Runbook Automation. PagerDuty Runbook Automation is based on the PagerDuty Process Automation software, and provides access to the same development and runtime environment without having to host a cluster. Its plugins allow teams to easily incorporate their infrastructure as nodes and reuse automation in end-to-end orchestrated job definitions. 

Use PagerDuty Runbook Automation to create business-relevant automation to speed up incident response, customer service, developer experience, and compliance. Learn more about optimizing cloud operations with Runbook Automation

Connect Everyone & Everything

The number of different systems in the modern digital landscape has exploded. Keeping systems and teams synchronized is critical to have a complete picture of what’s happening and what needs attention. With APIs, webhooks, and over 650 integrations, the PagerDuty Operations Cloud integrates with your tech stack, breaking through silos so that teams can work better together and provide customers with a better experience.

Status Update Notification Templates: PagerDuty has added new ways to keep internal stakeholders updated. Now you can standardize internal communications during incident response with HTML templates. For example, customize responses in a rich text editor and add images, screenshots, or graphs. Learn more about Status Update Notification Templates.

PagerDuty for Salesforce Service Cloud updates: Customer service agents are on the front lines during an incident and shouldn’t be left in the dark. This is where PagerDuty Customer Service Operations comes in. 

With our deepening integration with Salesforce Service Cloud, PagerDuty for Customer Service Operations creates a real-time link between DevOps, ITOps, and Customer Service teams with Salesforce Incident Objects and cases. This creates a single source of truth that brings all teams together. Agents are now fully empowered to engage and collaborate across teams to help resolve customer-impacting issues. Learn more about updates to PagerDuty Customer Service Operations

CollabOps updates: CollabOps has become the way many teams work and communicate in real time. We’ve simplified our integration with Slack with a single connection management page. And Google recently built a new integration: PagerDuty for Google Chat. Learn more about PagerDuty for Google Chat.

Deliver Speed and Flexibility

When it comes to incident response, it often feels like nothing can happen fast enough. Time is often wasted swiveling between systems to get information, manually reviewing notifications, and digging through interfaces to find what you need. Helping teams focus helps them work, and drive to resolution, faster. That’s why we’ve prioritized building features designed to help responders get what they need quickly, wherever and however they need it, so that PagerDuty fits seamlessly into the way each of our customers approaches incident response.

Custom Fields on Incidents: It starts with surfacing information where you need it most. Today, PagerDuty is announcing Custom Fields on incidents. Custom fields give responders access to critical contextual information from any surface, whether API, web, mobile app, or SMS. With more information pulled into custom fields, responders can triage and resolve issues faster. 

PagerDuty mobile app updates: We’ve also redesigned the home screen of the PagerDuty mobile app. The things responders care about most are now front and center, further accelerating incident resolution. With a single tap, responders can view details and take action. The carousel displays all options for responders to gain understanding while on the go. This feature is now in Early Access; to get access, sign your account up here.

Terraform support for Event Orchestration: Event Orchestration helps teams cut down on manual event processing by harnessing complex logic and rule nesting. We’re seeing customers replace ten Event Rules with a single Event Orchestration, a 90% reduction in the number of rules to maintain!

For customers who have been asking for more Terraform support for Event Orchestration as part of their Infrastructure-as-Code practices, we heard you! You can now configure orchestrations in Terraform to create, manage, and modify them at scale. Check out the Terraform provider documentation.

Continuously Improve

An incident doesn’t end with resolution. Investing in culture, implementing best practices, and learning from what happened in previous incidents helps teams build resilience. 

Service Standards: As customers mature their digital operations, they often want to standardize what ‘good’ looks like across teams. With Service Standards, teams can configure services according to best practices. For example, account owners can audit whether a service has its dependencies defined or whether its Escalation Policy has multiple levels. Service Standards provide custom guardrails to help scale service ownership across the entire organization. Learn about Service Standards
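As a rough illustration of what such an audit involves, the sketch below checks a list of services against the two example standards mentioned above. The `Service` class, its fields, and the `audit` function are hypothetical and are not part of PagerDuty's product or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Service:
    # Hypothetical, simplified view of a service's configuration.
    name: str
    escalation_levels: int = 0
    dependencies: List[str] = field(default_factory=list)


# Each standard is a named check that returns True when the service complies.
STANDARDS: Dict[str, Callable[[Service], bool]] = {
    "escalation policy has multiple levels": lambda s: s.escalation_levels >= 2,
    "declares service dependencies": lambda s: len(s.dependencies) > 0,
}


def audit(service: Service) -> List[str]:
    """Return the names of the standards this service does not meet."""
    return [name for name, check in STANDARDS.items() if not check(service)]


checkout = Service("checkout", escalation_levels=3, dependencies=["payments"])
legacy = Service("legacy-batch", escalation_levels=1)

report = {s.name: audit(s) for s in (checkout, legacy)}
```

In this example `checkout` passes both checks, while `legacy-batch` is flagged for a single-level escalation policy and for having no dependencies declared, which is the kind of gap an account owner would want surfaced.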

Next generation reports: Developing a more proactive operational posture starts with understanding how things are working today. From there, teams can identify opportunities for fine-tuning and improvement. Newly enhanced reports in PagerDuty help teams make better data-driven decisions. 

First up: the Service Performance Report has new interactive visualizations, intuitive service drill-down capabilities, a new Response Effort metric that measures the engagement time needed to resolve incidents, and more filtering options to help prioritize your incidents. Stay tuned for updates to the Incident Activity, Escalation Policy, and Responder Health reports.

Register now for PagerDuty Summit to learn more about any of these announcements, both in the keynotes and in the deeper-dive virtual breakout sessions. Ready to get started? Contact sales or sign up for a 14-day free trial.
