Date: 2024-08-07

CrowdStrike on Tuesday published its full root cause analysis for last month's defective channel file update that caused more than 8 million Windows systems to enter reboot loops.

The report, titled "External Technical Root Cause Analysis -- Channel File 291," detailed the factors that led to the botched Falcon sensor update being delivered to CrowdStrike customers, which triggered a mass IT outage on July 19. The cybersecurity vendor had previously issued a preliminary report that attributed the incident to a bug in the company's content validator, which allowed the channel file to pass internal checks.

According to Tuesday's 12-page report, a series of issues contributed to the errant release of channel file 291, starting with an earlier update to the Falcon platform. In February, the company released version 7.11 of its Windows sensor, which introduced a new template type for inter-process communication (IPC).

"The new IPC Template Type defined 21 input parameter fields, but the integration code that invoked the Content Interpreter with Channel File 291's Template Instances supplied only 20 input values to match against," the report said. "This parameter count mismatch evaded multiple layers of build validation and testing, as it was not discovered during the sensor release testing process, the Template Type (using a test Template Instance) stress testing or the first several successful deployments of IPC Template Instances in the field. In part, this was due to the use of wildcard matching criteria for the 21st input during testing and in the initial IPC Template Instances."

CrowdStrike explained that two IPC template instances were released on July 19, one of which included a non-wildcard matching criterion for the 21st input parameter. Those template instances produced a new version of channel file 291 that required sensors to inspect the 21st input parameter. However, CrowdStrike said no IPC template instances in previous channel file updates had used the 21st input parameter field.

"Sensors that received the new version of Channel File 291 carrying the problematic content were exposed to a latent out-of-bounds read issue in the Content Interpreter. At the next IPC notification from the operating system, the new IPC Template Instances were evaluated, specifying a comparison against the 21st input value," the report said. "The Content Interpreter expected only 20 values. Therefore, the attempt to access the 21st value produced an out-of-bounds memory read beyond the end of the input data array and resulted in a system crash."
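The mechanism the report describes can be illustrated with a short Python sketch. It is not CrowdStrike's code: the `evaluate` function, the `"*"` wildcard marker, and the sample values are all hypothetical, and Python raises an `IndexError` where the kernel driver performed an actual out-of-bounds memory read. The point is only to show why wildcard criteria hid the 21-versus-20 mismatch until a non-wildcard criterion arrived.

```python
WILDCARD = "*"  # hypothetical marker meaning "match anything"


def evaluate(criteria, inputs):
    """Return True if every non-wildcard criterion equals its input value."""
    for index, expected in enumerate(criteria):
        if expected == WILDCARD:
            continue  # wildcard fields never read the input array
        # No bounds check here: this mirrors the latent defect.
        if inputs[index] != expected:
            return False
    return True


# The integration code supplies only 20 input values.
inputs = [f"value{i}" for i in range(20)]

# 21 criteria, all wildcards: the 21st input is never touched.
wildcard_instance = [WILDCARD] * 21

# 21 criteria, with a non-wildcard criterion in the 21st field.
strict_instance = [WILDCARD] * 20 + ["value20"]

print(evaluate(wildcard_instance, inputs))

try:
    evaluate(strict_instance, inputs)
except IndexError as exc:
    print("out-of-bounds access:", exc)
```

In this sketch, the all-wildcard instance evaluates cleanly even though one input is missing, while the strict instance fails as soon as it indexes the 21st position.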

A logic error in the content validator allowed channel file 291 to be sent to the content interpreter, CrowdStrike said. While the defective channel file was merely a content update for configuration settings, it impacted the sensor software, including the Windows kernel driver, running on customers' systems. The incident has sparked a debate within the tech industry over whether Microsoft should provide kernel-level access to third-party vendors like CrowdStrike.

"In summary, it was the confluence of these issues that resulted in a system crash: the mismatch between the 21 inputs validated by the Content Validator versus the 20 provided to the Content Interpreter, the latent out-of-bounds read issue in the Content Interpreter, and the lack of a specific test for non-wildcard matching criteria in the 21st field," CrowdStrike wrote. "While this scenario with Channel File 291 is now incapable of recurring, it also informs process improvements and mitigation steps that CrowdStrike is deploying to ensure further enhanced resilience."
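For illustration, the two code-level gaps named in that summary, the count mismatch and the unchecked read, correspond to two generic safeguards: checking field counts before content ships and bounds-checking at evaluation time. The Python sketch below shows both under stated assumptions. The function names, the `"*"` wildcard marker, and the sample data are hypothetical and do not represent CrowdStrike's actual fixes.

```python
WILDCARD = "*"  # hypothetical marker meaning "match anything"


def validate_instance(criteria, supplied_input_count):
    """Pre-release check: list every non-wildcard field that has no input."""
    problems = []
    for index, expected in enumerate(criteria):
        if expected != WILDCARD and index >= supplied_input_count:
            problems.append(
                f"field {index + 1} needs an input, but only "
                f"{supplied_input_count} are supplied"
            )
    return problems


def field_matches(criteria, inputs, index):
    """Runtime check: a missing input is treated as a non-match, not a crash."""
    expected = criteria[index]
    if expected == WILDCARD:
        return True
    if index >= len(inputs):
        return False  # bounds check before touching the input array
    return inputs[index] == expected


inputs = [f"value{i}" for i in range(20)]        # 20 supplied values
strict_instance = [WILDCARD] * 20 + ["value20"]  # 21st field is non-wildcard

print(validate_instance(strict_instance, len(inputs)))
print(field_matches(strict_instance, inputs, 20))
```

Either safeguard alone would stop this particular failure in the sketch. Using both reflects a defense-in-depth choice, since the validator and the interpreter can drift apart over time.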