PagerDuty incident response tools loop in business stakeholders

PagerDuty follows Atlassian with incident response tracking for corporate management, while IT pros on the front lines of incident triage also seek enhancements to low-level alerting configurations.

The scope of PagerDuty incident response tools widened this week, with the addition of two products that give business stakeholders visibility into incident response.

PagerDuty Visibility offers corporate managers a view into ongoing incidents as IT mobilizes to address them. This access helps business units react effectively to external customer concerns about the incidents, and it frees up IT first responders to address the incident, rather than update business stakeholders on the situation.

"As part of the triage team, we want to take care of the business side. But we don't want them coming into our conference bridge asking for an update when we could be spending that time on triage," said Ben Hwang, cloud automation leader at GE Digital, General Electric's software engineering division, which uses PagerDuty and helped test PagerDuty Visibility in its alpha and beta stages. "PagerDuty Visibility is better than the internal tools we've been using, and our IT organization has been waiting for it to launch."

Another product released this week, PagerDuty Analytics, is a counterpart to Visibility that offers business managers a long-term view of incident response trends in cost and average time to resolution. Hwang said this product also appeals to him, but would require wider adoption of PagerDuty across more GE Digital teams to be truly effective. Many of these teams are in flux while GE plans to spin them off.

These products are part of a trend in IT incident response that fits into broader shifts in corporate thinking around IT service management. Atlassian adjusted its incident response tools this month with the acquisition of OpsGenie and the launch of Jira Ops, which also offers visibility into ongoing incidents for business stakeholders. Splunk also expanded its incident response offerings with the acquisition of VictorOps in June 2018.

IT shops want PagerDuty incident response flexibility

PagerDuty users are intrigued by the company's additional incident response tools, but they said the core triage product also needs work to keep pace with complex organizational structures that arise from microservices architectures.

At SPS Commerce, for example, PagerDuty should notify database teams when an incident affects certain apps with database dependencies, but the company also wants to track those incidents according to the service they're part of, said Andy Domeier, director of technology operations at the communications network for supply chain and logistics businesses in Minneapolis.

Today, however, PagerDuty notifies every member of an application or service team about an incident, even when only the database staff needs the notification.

"Right now, we rely on our team in India to route notifications to the database team," Domeier said. "We'd like to be able to override escalation policies for certain integrations so that they're still associated with a certain application, but only go to the database team."
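The kind of override Domeier describes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not PagerDuty's data model or API: the `Service` class, the `overrides` mapping, the `route` function, and all service, team and integration names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical application service with a default escalation target."""
    name: str
    default_team: str
    # Hypothetical per-integration overrides: integration name -> team to notify.
    overrides: dict[str, str] = field(default_factory=dict)


def route(service: Service, integration: str) -> dict[str, str]:
    """Pick the team to notify while keeping the incident tied to its service."""
    team = service.overrides.get(integration, service.default_team)
    return {"service": service.name, "notify": team}


orders = Service(
    name="orders-app",
    default_team="orders-team",
    overrides={"db-monitor": "database-team"},
)

# A database alert reaches only the database team but is still tracked
# against the application it belongs to.
print(route(orders, "db-monitor"))
# Any other integration falls back to the service's default team.
print(route(orders, "http-checks"))
```

In this sketch the incident record always carries the application name, so reporting by service still works even when the notification goes elsewhere.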

PagerDuty Visibility and Analytics look interesting, but until that more basic notification-routing issue is solved, SPS will struggle to adopt those capabilities, Domeier said.

"We provided early feedback for PagerDuty Visibility, and it has a lot of great potential," he said. "But we pulled out of the beta because of the issues with service alignment."

GE Digital's Hwang said he's had a similar problem aligning escalation policies with distributed teams. But he added that PagerDuty Visibility offers more flexibility with escalation policies, which he hopes will also find its way into other PagerDuty products.

That is the plan. PagerDuty Visibility introduces business services and a service dependence hierarchy, which will eventually be added to the core platform for alerting and escalation, a PagerDuty spokesperson said in an email.
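To make the idea of a service dependency hierarchy concrete, here is a rough sketch of how an incident on one technical service could be traced up to the business services that rely on it. The `DEPENDENTS` map, the `impacted` function and every service name are invented for illustration; the article does not describe how PagerDuty implements this.

```python
from collections import deque

# Hypothetical dependency map: each service -> the services that depend on it.
DEPENDENTS = {
    "postgres-cluster": ["orders-api", "billing-api"],
    "orders-api": ["Online Checkout"],
    "billing-api": ["Online Checkout", "Invoicing"],
}


def impacted(service: str) -> list[str]:
    """Return everything downstream of `service`, in breadth-first order."""
    visited = {service}
    result = []
    queue = deque([service])
    while queue:
        current = queue.popleft()
        for dependent in DEPENDENTS.get(current, []):
            if dependent not in visited:
                visited.add(dependent)
                result.append(dependent)
                queue.append(dependent)
    return result


print(impacted("postgres-cluster"))
```

A business stakeholder would care about the last entries in that list (the business services), while the triage team works on the first one.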

PagerDuty has also repackaged its products. What was previously offered as one platform is now split into three major product tiers:

  • PagerDuty Platform Team is for single teams within larger organizations.
  • PagerDuty Business is for multiple teams, and it also includes more advanced scalability, high availability and security features.
  • PagerDuty Enterprise bundles in products that are now add-ons for the lower tiers, including Modern Incident Response, Event Intelligence and PagerDuty Visibility.

PagerDuty Analytics is a separate add-on for all levels.

Modern Incident Response, which provides visibility into ongoing incidents, overlaps somewhat with PagerDuty Visibility. And Event Intelligence, which uses analytics to offer triage recommendations, overlaps with PagerDuty Analytics.

The chief differences between these tools are their target audiences, PagerDuty officials said. Modern Incident Response and Event Intelligence are aimed at incident response teams that handle triage, while Visibility and Analytics are meant for business stakeholders within the wider organization.

PagerDuty incident response pricing

PagerDuty Visibility is available as an add-on to customers on Lite, Platform Team and Platform Business. The add-on list price is $15 per user, per month when purchased annually and $18 per user, per month when purchased month to month for all users in the account, excluding stakeholder users. It will be included for all customers on Enterprise and given to all customers with the existing Operations Command Console add-on at no additional cost.
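As a quick worked example of those list prices, the sketch below compares a year of the Visibility add-on under each billing option. The 50-user account size and the `yearly_cost` helper are assumptions for illustration; only the $15 and $18 rates come from the pricing above, and stakeholder users are left out of the count because they are excluded.

```python
ANNUAL_RATE = 15   # dollars per user, per month, when purchased annually
MONTHLY_RATE = 18  # dollars per user, per month, when purchased month to month


def yearly_cost(billable_users: int, annual_billing: bool) -> int:
    """Twelve months of the add-on for all non-stakeholder users."""
    rate = ANNUAL_RATE if annual_billing else MONTHLY_RATE
    return rate * billable_users * 12


users = 50  # hypothetical account size, excluding stakeholder users
annual = yearly_cost(users, annual_billing=True)
monthly = yearly_cost(users, annual_billing=False)
print(annual, monthly, monthly - annual)  # 9000 10800 1800
```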

PagerDuty Analytics is an add-on to all levels of the PagerDuty platform, and pricing information was not disclosed as of press time.

Editor's note: Pricing information was added after publication.
