Date: 2022-04-07
Atlassian cloud outage could take days to resolve
Atlassian core cloud products were down for more than 24 hours this week, even as the company upped its Enterprise SLA and scale.
Atlassian launched three new cloud-based products at its Team '22 conference this week, but a major cloud outage hobbled its core services and distracted industry watchers from the new releases.
Atlassian cloud products including Jira Software issue tracking, Jira Service Management ITSM, Jira Work Management, Confluence documentation, Opsgenie incident response and the Access single sign-on tool became inaccessible to an unspecified subset of users Tuesday, and efforts to restore them continued as of midday Thursday.
"While conducting a routine maintenance script, a small number of sites were unintentionally disabled, which resulted in them being unable to access their products and data," the company said in a statement issued through a spokesperson. "We know our customers rely on our products to get their work done, and we are sorry for the disruption this has caused. We are working 24/7 to restore products to full availability."
In a later statement, the company said the incident was not the result of a cyberattack and there has been no unauthorized access to customer data. The company added that, while hundreds of engineers are working to recover the sites, it is also adding recovery automation to allow it to recover sites faster in the future.
"Due to the unique configuration of each site as well as the care we are taking to ensure safe data restoration, we estimate that full resolution could take days, though we expect customers to begin seeing restoration on a product-by-product basis sooner," the second statement said.
The company will publish a post-mortem after the incident is resolved, according to the latter statement.
Another update from Atlassian's official support account on Twitter Thursday afternoon Eastern Time appeared to indicate that some customers had suffered data loss.
"We expect most site recoveries to occur with minimal or no data loss," the Ask Atlassian account (@AskAtlassian) tweeted on April 7, 2022.
"This is extremely concerning to us, as our mission-critical institutional knowledge lives in Confluence at this point," one business customer wrote in an email to SearchITOperations. The customer, who requested anonymity, added that, "This message runs counter to the 'maintenance script has disabled a small number of sites' message we’ve been getting over and over again. This would also explain why recovery has taken days with so many engineers 'working 24/7.'"
Atlassian outage impact remains uncertain
It's too soon to tell how the outage will affect Atlassian's business, but industry observers agreed the timing was exceptionally poor given the company's push toward its cloud-based services during the past 18 months. Atlassian's public statements have been especially frank over the last year about its increased emphasis on cloud tools and added incentives for users to migrate away from its on-premises products, for which it has discontinued its midmarket Server editions and raised enterprise licensing prices.
"Many Atlassian products are from acquisitions, and moving to a subscription model while integrating each product is not easy," said Larry Carvalho, independent analyst at RobustCloud. "Multi-day downtime does not do well to convince customers to make a move."
Other experts, however, preferred to wait and see how long the outage lasts and how it's resolved before predicting its ultimate impact.
"It depends on how soon they fix it, how major the problem was and what promises they make going forward," said Andy Thurai, vice president and principal analyst at Constellation Research. "Any cloud, including AWS, will go through this. It all depends on how they handle it."
Atlassian had a poor reputation for reliability in its cloud services during an initial self-managed foray into SaaS years ago, but a move to microservices on AWS in 2019 and the introduction of enterprise security features and service-level agreements (SLA) did much to reassure early skeptics. The company has since had a good overall track record of cloud availability, and announced a 99.95% uptime SLA for its Enterprise cloud edition this week at the Team '22 conference, along with an early access program for scaling its cloud instances to support up to 50,000 users.
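For a sense of scale, the arithmetic behind an uptime percentage can be sketched as follows. This is a minimal illustration only: the 30-day measurement window and the function name `allowed_downtime_minutes` are assumptions made here, not details of how Atlassian defines or measures its SLA, and the article does not say which affected customers were covered by it.

```python
def allowed_downtime_minutes(uptime_pct: float, period_days: float = 30.0) -> float:
    """Return the minutes of downtime a given uptime percentage permits.

    period_days is an assumed measurement window (30 days here); real SLAs
    define their own windows and exclusions.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_pct / 100)


# Downtime budget for a 99.95% target over the assumed 30-day window.
budget = allowed_downtime_minutes(99.95)

# A 24-hour outage, expressed in minutes, for comparison.
outage_minutes = 24 * 60

print(f"Budget: {budget:.1f} minutes per 30 days")
print(f"A 24-hour outage is about {outage_minutes / budget:.0f}x that budget")
```

Under these assumptions, a 99.95% target leaves roughly 21.6 minutes of downtime per 30 days, so an outage lasting more than 24 hours would exceed that budget many times over for any site it applied to.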
At least initially, the outage did little to change Atlassian users' existing views on its cloud products, whether they were positive or negative.
"You expect outages from time to time," said Chris Riley, senior manager of developer relations at marketing tech firm HubSpot, which uses Jira Software Cloud but was not affected by this week's outage. "But I actually can't recall a single outage [with Atlassian]."
Other IT pros said this week's downtime reinforced their reluctance to use Atlassian cloud for production apps.
"I typically only use Atlassian Cloud products for testing," said Rodney Nissen, senior Atlassian admin at Activision Blizzard, which uses Jira Data Center on-premises. "The thing to remember with any cloud offering is that these systems aren't magical, they are just someone else's computer. They are subject to the same errors and problems that could plague any other system."
Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached on Twitter @PariseauTT.