Owning Incident Response: It’s All About The Iterative Improvements
Recently, I was putting together training material for our upcoming track on “Owning Incident Response” at PagerDuty University, and I listened to the recordings of incident...
Being on-call is already a demanding and sometimes very unforgiving responsibility. If you are working in a regulated industry, however, the demands that incident management...
In recent years, it feels as if many major brands have suffered major infrastructure failures during one of the busiest holiday shopping days — Black...
The point of continuous integration is to automate builds and tests, and bring efficiency and quality to the pipeline. However, things do sometimes go wrong...
The threat landscape is expanding at a crazy pace. There are new vulnerabilities released every day, and the number of servers, applications, and endpoints for...
In our always-on, IoT-enabled, cloud-connected, big data age, we face a major paradox: it’s now easier than ever to collect large amounts of data —...
Organizations need many incident commanders to provide a high level of service to their customers while avoiding on-call load. Many shy away from...
About a year ago, some technical difficulties at Citi temporarily shut off a few hundred thousand cards and a swath of ATMs at the same...
“Incident lifecycle management? If we manage to stay alive from one incident to the next, it’s a good day. On a bad day, it’s all...
Today, we’re excited to announce a suite of new functionality to power even faster resolution and accelerate learning from major business-impacting incidents with the definitive...
Your high school history teacher no doubt delivered to you some variation on George Santayana’s famous remark that, “those who cannot remember the past are...
The Internet of Things (IoT) is becoming very popular, both in people’s lives and in enterprises globally. While it began as a novelty, more...
The fear of failure can be a massive hurdle for many development and ops team members. This fear can be so overbearing that morale across...
Incident management is paramount to the success of any modern ITOps team. However, much like growing a business, scaling incident management can also trigger growing...
Incident response bottlenecks – you know they’re real and you know that your incident response system probably has a few, but they must be minimized...
It’s critical to have the right tools in place before a firefight happens. A lack of proper tooling makes it significantly more difficult to recognize, organize,...
If technical debt were like monetary debt, it would be hard to keep track of it unless you checked in manually. The only way many...
According to a roundup by Gartner, the average cost of downtime for an enterprise is $5,600 per minute. While the data collected was from incredibly...