Texas power outage flags need to revisit business continuity

The freezing conditions that caused the Texas power outages affected businesses well beyond the state's borders, a sign that many organizations need to revisit their business continuity plans.

Texas companies aren't the only ones that may need to revisit business continuity and disaster recovery plans in the aftermath of the state's uncommonly severe winter weather.

The power outages and bursting pipes that crippled millions of residents and businesses in Texas also took a toll on companies and government agencies in states ranging from Florida to California.

California's Department of Health Care Services (DHCS), for instance, confirmed that its Medicaid Management Information System and Medi-Cal website were down from about 2 a.m. on Feb. 15 until 6:30 a.m. on Feb. 17 because of a weather-related power outage affecting the Dallas-based IBM data center that hosts them.
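To put the DHCS outage window in perspective, the downtime can be worked out from the two reported timestamps. This is only a rough sketch: the year (2021) and the assumption that both times share one time zone are not stated in the report, and the variable names are illustrative.

```python
from datetime import datetime

# Assumptions: the year is 2021 and both timestamps are in the same
# time zone; the report gives neither explicitly.
outage_start = datetime(2021, 2, 15, 2, 0)   # "about 2 a.m. on Feb. 15"
outage_end = datetime(2021, 2, 17, 6, 30)    # "6:30 a.m. on Feb. 17"

downtime = outage_end - outage_start
downtime_hours = downtime.total_seconds() / 3600

print(f"Approximate downtime: {downtime_hours:.1f} hours")
```

Under those assumptions, the systems were unavailable for roughly 52.5 hours, or a little more than two days.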

One of the largest companies to report an outage was State Farm. The insurance giant alerted customers about system connectivity issues "due to the extreme weather" via a Facebook post at 2:04 p.m. on Feb. 15. Updates at night and the following morning instructed customers to reach out via direct message to the company's Facebook or Twitter accounts if they needed assistance. At 12:32 p.m. on Feb. 17, a Facebook post indicated the majority of State Farm systems had been restored.

State Farm declined requests to provide more information on the nature of the outage. Although its headquarters is in Bloomington, Ill., State Farm built a major data and operations center in Richardson, Texas, in the last decade, according to Dallas news reports.

Greater Dallas hotbed for data centers

Texas -- and especially Dallas and communities north of the city -- has become a hotbed for data centers and colocation companies in the last 10 years, noted Bob Gill, a research vice president at Gartner. Draws include the location in the middle of the country, the availability of land, energy and fiber connectivity, favorable regulations and taxes, and the dearth of weather extremes and natural disasters.

"I still think Texas is a good place for a data center," Gill said. "Outside of Houston and down along the Gulf, Texas is considered pretty benign from the perspective of natural disasters. They just got caught on this one."

Gill said a "confluence of worst-case scenarios" lined up at the same time to cause problems. For instance, the uncharacteristic duration of a snowstorm over a widespread area created transportation issues for businesses that depend on fuel to run their generators.

Yet Gill said that, despite the "prolonged misery" in Texas, the major data center providers and colocation companies he covers reported no outages to him, and his phones were not ringing off the hook with client calls. His business continuity advice to those that did experience an outage remains the same.

"If you positively, absolutely can't get by without that data center being down, you better have it replicated somewhere else," Gill said. "We should have our eyes wide open from the perspective of business continuity around the country, and we should be multi-region."

Texas power outage spotlights risk factor

One potential risk factor unique to much of Texas is the state's independent, privatized power grid, which this month's freezing temperatures drove to the brink of collapse. The Electric Reliability Council of Texas (ERCOT) ordered rolling outages to prevent a complete blackout. Five ERCOT board members resigned this week in the face of inquiries about the grid's preparation for winter conditions.

Forrester Research senior analyst Brent Ellis said the city of El Paso could serve as a potential model for ERCOT. El Paso was spared the massive outages that afflicted much of Texas because it gets power through the Western Interconnection grid, which includes other Western states, northwestern Mexico and two Canadian provinces. Ellis said ERCOT should consider connections to other grids in North America because winterization efforts could prove problematic in Texas.

"They have a lot of smaller generation facilities there that would have to fork over the money to do the winterization, and that affects how much they charge customers for their electricity," Ellis said. "There is a certain point where you can't charge people much more money without having customers either move or decide not to use as much electricity."

Ellis expects that major technology companies already based in Texas, such as Dell, and those planning to relocate their headquarters to Texas, including Hewlett Packard Enterprise and Oracle, will have serious discussions with Texas legislators to address the power issues.

"Texas has been trying to invite industry to have more jobs and more money in the state. They don't want to become California. They want to stay privatized and deregulated," Ellis said. "But, if they want to be able to entertain the high-tech world, there are things they're going to have to address as a state. They will see tech businesses leave rather than move there if they can't provide a stable environment to operate."

Business continuity planning across many states

DHCS faced scrutiny over how extreme weather conditions in Texas could affect Medicaid services on the West Coast. DHCS operates its own data center in Sacramento, Calif., but it also leases or owns equipment at facilities in Colorado, Connecticut, Ohio, Oregon, Pennsylvania, Texas, Virginia and Washington to host services, systems and data, according to Carol Sloan, the department's public information officer.

Sloan said DHCS plans no major changes to its business continuity strategy in the wake of the Texas outage because the department routinely looks to improve system uptime, build redundancies across locations and improve recovery time objectives. DHCS assesses and migrates applications to cloud service providers, such as AWS, and builds modern, cloud-native applications on a continual basis, she said.

Availity monitored its Twitter feed throughout its online service outage and provided details on the nature of the problem.

In response to its Dallas data center outage, IBM issued the following statement: "Our facilities in Texas endured the same extreme weather that other businesses across the state experienced. We worked diligently to overcome the challenges of freezing temperatures and rolling blackouts to restore service for our clients."

Another organization outside Texas that suffered an online service outage because of a problem at a Dallas data center promised customers a "large-scale investigation" into the root cause of the incident. Availity, a Jacksonville, Fla.-based company that operates a health information network to facilitate electronic transactions and collaboration, said the Feb. 16 downtime resulted from an electrical switching failure at its Dallas data center. Availity declined to name the provider, citing security and privacy concerns.

Anatomy of an outage

Availity COO Paul Joiner said the rolling power outages required the data center to shift back and forth between utility power and generator power. The data center provider made those transitions successfully in the hours before the incident, but a failure between redundant management plane switches then cut power to the circuits feeding Availity's equipment. Although the provider restored power within an hour, Availity found that its customer-facing applications were offline.

"When there's a sudden and complete power loss, it increases the risk of component failures and requires extensive manual intervention and troubleshooting," Joiner said. "In this situation, we lost access to the network management plane that allows engineers to access, monitor and manage connected devices."

Team members determined that, with the power restored and stable, the quickest path to recovery would be to bring up the Dallas site rather than switch to the company's secondary data center in Atlanta. Availity's systems were back online by about 5:30 p.m. on Feb. 16.
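The kind of decision described here, restoring the primary site versus failing over to a secondary one, can be framed as a simple comparison of estimated recovery times across the sites that have stable power. The sketch below is purely illustrative: the `RecoveryOption` type, the `quickest_path` helper and all of the time estimates are hypothetical and do not come from Availity.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    name: str
    estimated_minutes: float  # hypothetical estimate of time to restore service
    power_stable: bool        # whether the site currently has stable power


def quickest_path(options):
    """Return the option with stable power and the shortest estimated recovery."""
    viable = [o for o in options if o.power_stable]
    if not viable:
        raise ValueError("no site with stable power is available")
    return min(viable, key=lambda o: o.estimated_minutes)


# Hypothetical figures for illustration only; not reported by Availity.
options = [
    RecoveryOption("restore primary (Dallas)", 180, True),
    RecoveryOption("fail over to secondary (Atlanta)", 240, True),
]

choice = quickest_path(options)
print(choice.name)
```

With these made-up estimates, restoring the primary site wins. If the primary's power were still unstable, the same comparison would select the secondary site instead.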

"As a result of this incident, we are examining a number of people, process and technology improvement opportunities," Joiner said. "We expect the full results of this investigation will bring about new practices that will help us avoid similar issues in the future."
