CrowdStrike incident

On July 19, 2024, a faulty update from CrowdStrike, one of the largest American cybersecurity software companies, crashed approximately 8.5 million Windows workstations and servers. The event is widely regarded as the largest IT outage in history.

What happened?

That day, CrowdStrike pushed an update to its Falcon software that triggered the infamous Blue Screen of Death (BSOD) on the Windows machines that received it. Every restart ended in the same crash, so access to the devices appeared to be permanently lost.

Who was affected?

Many major companies and public services were hit by the incident, including airlines, government agencies, hospitals, banks, and railways. Airline ticketing systems failed, resulting in flight cancellations. Healthcare systems experienced outages, leaving some patients without care. Millions of office workers found themselves unable to access their computers. Initially, many assumed it was a cyberattack. System administrators canceled vacations and worked overtime to manually resolve the issues on each system.

Was Microsoft at fault?

Although many initially blamed Microsoft, it quickly became clear that the fault lay in CrowdStrike's update, not in Windows itself. The event caused massive financial losses and service disruption, the full extent of which will take time to assess.

Lessons and prevention

The incident underscores the importance of robust cybersecurity measures and adherence to standards like ISO 27001:2022 to mitigate damage. Proper controls could have minimized the impact of such a widespread system failure.

This catastrophe serves as a stark reminder of the vulnerabilities in modern interconnected systems and the critical need for effective risk management in IT operations.

 

What is CrowdStrike?

CrowdStrike is a U.S.-based cybersecurity company founded in 2011, specializing in developing software and services to protect against cyberattacks, malware, and other cyber threats. Its primary product, Falcon Platform, uses advanced techniques such as artificial intelligence (AI) and machine learning (ML) to detect and prevent threats in real time. CrowdStrike’s solutions are widely used by large corporations and government agencies globally.

How Does the Falcon Platform Work?

The core component of the Falcon Platform is the Falcon Sensor, a software agent installed on workstations and servers. It continuously monitors system activity and uses AI and ML to analyze system behavior, identifying malware, malicious actions, and security incidents in real time and providing proactive defense against emerging threats.

What Went Wrong?

On July 19, 2024, CrowdStrike released a content update for the Falcon Sensor on Windows that exposed a critical software flaw. The sensor includes a driver that runs in Ring 0, also known as kernel mode in Windows. This level of access allows the driver to interact directly with the core of the operating system. When the driver processed the new content, it performed an invalid memory access. A fault in kernel mode cannot be contained the way a crash in an ordinary application can, so Windows halted with the Blue Screen of Death (BSOD). Because the driver loaded again on every startup, affected machines were stuck in a continuous cycle of crashes and reboots.
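To make the failure mode concrete: public post-incident reporting attributed the crash to the sensor reading memory outside the bounds of the data it had been given. The Python sketch below is a simplified illustration of the kind of bounds check that guards against this. It is not CrowdStrike's code; the function names and the rule format (a mapping from field index to expected value) are hypothetical.

```python
def validate_rule(field_indices, input_count):
    """Return the field indices that fall outside the supplied inputs."""
    return [i for i in field_indices if not 0 <= i < input_count]


def read_field(inputs, index):
    """Return inputs[index], refusing any index outside the supplied inputs."""
    # Python would silently wrap a negative index, so check explicitly.
    if not 0 <= index < len(inputs):
        raise IndexError(f"field {index} requested, only {len(inputs)} supplied")
    return inputs[index]


def evaluate_rule(rule, inputs):
    """Evaluate a rule only if every field it references is present.

    rule maps a field index to the value expected at that index (hypothetical format).
    Returns True on match, False on mismatch, None if the rule was rejected.
    """
    if validate_rule(rule.keys(), len(inputs)):
        return None  # reject the rule instead of reading past the inputs
    return all(read_field(inputs, i) == expected for i, expected in rule.items())
```

In Python an out-of-range index raises an exception that the program can catch. Kernel-mode code written in C or C++ has no such safety net, so a read past the end of an array can touch invalid memory and bring down the whole operating system.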

How Was the Issue Resolved?

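The workaround relied on Safe Mode, a Windows startup mode that loads only essential drivers. After starting in Safe Mode (or the Windows Recovery Environment), administrators could delete the faulty update file from the CrowdStrike driver directory, after which the system booted normally. The fix had to be applied to each affected machine individually, which is why recovery took so long.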
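The deletion step can be sketched in a few lines. This is only an illustration of the logic, written in Python for readability; in practice administrators did it by hand or with a short script from the recovery environment. The default file pattern C-00000291*.sys comes from CrowdStrike's public remediation guidance rather than from this article, and the function name and parameters are hypothetical.

```python
from pathlib import Path


def remove_faulty_channel_files(driver_dir, pattern="C-00000291*.sys", dry_run=True):
    """Delete files in driver_dir whose names match pattern.

    Returns the sorted list of matching paths.
    With dry_run=True (the default) nothing is deleted.
    """
    directory = Path(driver_dir)
    if not directory.is_dir():
        raise FileNotFoundError(f"not a directory: {directory}")

    # Only plain files directly inside the directory are considered.
    matches = sorted(p for p in directory.glob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Running it first with dry_run=True lists what would be removed, which is a sensible precaution before deleting anything under a system directory. On affected machines the directory reported in public guidance was C:\Windows\System32\drivers\CrowdStrike.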

Impact on Other Operating Systems

Interestingly, the issue did not affect Linux or macOS, because the faulty update was delivered only to Windows hosts. This may prompt some organizations to consider diversifying their operating systems to increase resilience against such incidents.

Potential Information Security Consequences

A critical scenario emerged for systems protected by BitLocker, a tool that encrypts data on Windows machines. On such machines, starting in Safe Mode or the recovery environment requires the BitLocker recovery key. If that key was stored only on a system that was itself affected, administrators could not easily retrieve it. Those who had offline backups of their recovery keys were fortunate, while others risked permanent data loss.

Future Implications

This incident has raised concerns about trust in major cybersecurity companies. It highlights the need for robust contingency planning, as outlined by standards like ISO 27001, which emphasizes risk management, incident response, and data protection practices.

CrowdStrike’s mishap serves as a reminder of the fragility of even the most advanced security systems and the importance of diversified defenses and stringent quality assurance in cybersecurity software development.
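One quality-assurance control often discussed after the incident is a staged rollout, in which an update reaches a small group of machines first and proceeds only while those machines stay healthy. The sketch below shows the idea under stated assumptions: the ring structure, the deploy and crash_rate callables, and the 1% threshold are all hypothetical, and it does not describe any vendor's actual release process.

```python
def staged_rollout(rings, deploy, crash_rate, max_crash_rate=0.01):
    """Deploy to each ring in order and stop at the first unhealthy ring.

    rings: list of (name, hosts) pairs, smallest ring first.
    deploy: callable that pushes the update to a list of hosts.
    crash_rate: callable returning the fraction of the given hosts that crashed.
    Returns (completed_ring_names, halted_ring_name_or_None).
    """
    completed = []
    for name, hosts in rings:
        deploy(hosts)
        if crash_rate(hosts) > max_crash_rate:
            # Halt here: later rings never receive the update.
            return completed, name
        completed.append(name)
    return completed, None
```

With a rollout like this, a crash-inducing update would stop at the first ring that shows failures instead of reaching every machine at once.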