Microsoft AI researchers mistakenly expose 38 TB of data

Microsoft said the Azure Storage exposure, which has been mitigated, affected no customer data and that 'no other internal services were put at risk because of this issue.'

Alexander Culafi, Senior News Writer, Dark Reading

Published: 18 Sep 2023

Cloud security vendor Wiz discovered 38 TB of private Microsoft data that was accidentally exposed by AI researchers employed by the tech giant.

Wiz's research was published in a blog post Monday as part of coordinated disclosure with Microsoft. According to Wiz security researchers Hillai Ben-Sasson and Ronny Greenberg, who authored the research, Microsoft's AI research team accidentally exposed 38 TB of private data while publishing "a bucket of open-source training data on GitHub." Exposed data included a disk backup of two employee workstations, as well as passwords, private keys, secrets and "over 30,000 internal Microsoft Teams messages."

Data was exposed, Wiz said, because the AI researchers shared files using shared access signature (SAS) tokens, which are signed URLs in Azure Storage used to share data and control access permissions. However, while those permissions can be limited strictly, down to a single file, "the link was configured to share the entire storage account -- including another 38TB of private files," Wiz researchers wrote.

"In addition to the overly permissive access scope, the token was also misconfigured to allow 'full control' permissions instead of read-only," the post read. "Meaning, not only could an attacker view all the files in the storage account, but they could delete and overwrite existing files as well."

SAS tokens carry security risks when not carefully managed because inappropriate permissions could provide significant access to sensitive data. Moreover, as security vendor Orca Security noted in June, threat actors can discover exposed cloud assets in mere minutes.
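To make the two misconfigurations Wiz described more concrete, the following is a minimal Python sketch of how a SAS URL could be checked before it is shared. It is an illustration only, not Wiz's or Microsoft's tooling. It relies on the standard SAS query parameters (`sp` for permissions, `se` for expiry, `ss` and `srt` for account-level tokens). The function name `audit_sas_url`, the seven-day lifetime threshold and the example URLs are assumptions made for this sketch.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Permission letters that let a token holder change or remove data
# (write, delete, add, create, update, process). Read and list are excluded.
WRITE_LIKE = set("wdacup")


def audit_sas_url(url, max_lifetime_days=7, now=None):
    """Return a list of warnings for a SAS URL (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    warnings = []

    # Account SAS tokens carry `ss` (services) and `srt` (resource types),
    # so they can reach far beyond a single file.
    if "ss" in params or "srt" in params:
        warnings.append("account-level scope")

    # Anything beyond read/list means the holder can alter stored data.
    risky = sorted(set(params.get("sp", "")) & WRITE_LIKE)
    if risky:
        warnings.append("grants more than read: " + "".join(risky))

    expiry = params.get("se")
    if expiry is None:
        # The expiry may instead live in a stored access policy (`si`).
        warnings.append("no expiry in the URL")
    else:
        expires = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        if expires - now > timedelta(days=max_lifetime_days):
            warnings.append("long-lived token")

    return warnings


if __name__ == "__main__":
    # Placeholder URL with a fake signature, for illustration only.
    example = (
        "https://example.blob.core.windows.net/"
        "?sv=2020-08-04&ss=b&srt=sco&sp=rwdlac"
        "&se=2040-01-01T00:00:00Z&sig=FAKE"
    )
    for warning in audit_sas_url(example):
        print(warning)
```

A check like this only inspects what the URL itself declares. It cannot tell whether a token has already been leaked or revoked, so it complements rather than replaces monitoring of the storage account.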

Wiz found the link to the exposed data in a repository on Microsoft-owned GitHub during the cloud security vendor's regular internet scans. Shir Tamari, Wiz head of research, told TechTarget Editorial the "overly permissible token" had been publicly accessible on GitHub for the last three years, "making it possible for anyone to locate."

"A threat actor would not need deep technical expertise to gain access to this data. It could have been discovered and exploited by practically anyone," Tamari said.

Figure: A visual from Wiz's Monday blog post on the security risks of using SAS tokens.

In response, the Microsoft Security Response Center published a blog post Monday dedicated to the exposure. The tech giant said it addressed the issue and emphasized that no customer data was exposed, no customer action is required and "no other internal services were put at risk because of this issue."

"After identifying the exposure, Wiz reported the issue to the Microsoft Security Response Center (MSRC) on June 22nd, 2023. Once notified, MSRC worked with the relevant research and engineering teams to revoke the SAS token and prevent all external access to the storage account, mitigating the issue on June 24th, 2023," the post read. "Additional investigation then took place to understand any potential impact to our customers and/or business continuity. Our investigation concluded that there was no risk to customers as a result of this exposure."

TechTarget Editorial asked a Microsoft spokesperson whether any data had been exfiltrated beyond Wiz's research. In response, the spokesperson said, "Our investigation did not identify any other unintended access."

The misconfigured SAS token and data exposure follow a series of issues at the technology giant that have come to light in the wake of the Storm-0558 attacks. In July, Microsoft disclosed that a China-based threat actor it identifies as Storm-0558 had compromised the email systems of approximately 25 customers, including some federal government agencies. Microsoft revealed that the attackers used a stolen Microsoft account (MSA) sign-in key and exploited a token validation issue.

Microsoft later disclosed that it had committed a series of errors that allowed the threat actors to steal the MSA key from its corporate network and use it to compromise customer email accounts. Those errors included allowing email systems to grant requests for enterprise email using a security token signed with an MSA key, which Microsoft later corrected.

Alexander Culafi is an information security news writer, journalist and podcaster based in Boston.
