
Microsoft summit to prevent repeat of IT outage yields no real results

Culture shift is needed to discuss security more honestly

Microsoft hosted a security summit on September 10. Without any media present, the company discussed possible measures against a repeat of the July 19 global IT outage. For now, the payoff has been modest. To the outside world, Microsoft and other security vendors have only presented familiar answers without offering new solutions.

The pedigree of the participants at the Windows Endpoint Security Ecosystem Summit was undeniable. In addition to Microsoft and CrowdStrike, representatives from Broadcom, ESET, SentinelOne, Sophos, Trellix and Trend Micro were present.

Not listening well enough?

In places, the Microsoft findings read like a scathing report on CrowdStrike’s failings. That was probably not the intent, since the tone is emphatically cooperative elsewhere. After all, Microsoft suggests that the entire ecosystem of partners will learn from each other’s findings.

Still, it cannot be ignored that the best practices cited by Microsoft aren’t anything new. One example is the set of Safe Deployment Practices (SDPs) that have been laid out in official documentation for years. In other words, every vendor should already have been well aware of how security should be implemented on Windows devices.

The CrowdStrike update that took out 8.5 million Windows devices in one fell swoop on July 19 did not follow these SDPs. Instead, every user simultaneously received an instant update to the CrowdStrike Falcon sensor that blue-screened Windows systems worldwide. Since then, however, customers have been able to control when the sensor receives new security information. That at least follows the SDP in which Microsoft proposes rolling out updates incrementally and rolling them back where necessary. It’s a basic requirement, as mentioned earlier.
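To make the idea of an incremental rollout with rollback concrete, here is a minimal Python sketch. It is purely illustrative and does not represent Microsoft’s SDP documentation or CrowdStrike’s actual tooling: the ring names, the `apply_update` and `rollback` callbacks, and the 1 percent failure threshold are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RolloutResult:
    completed_rings: List[str] = field(default_factory=list)
    rolled_back: bool = False


def staged_rollout(
    rings: List[Tuple[str, List[str]]],
    apply_update: Callable[[str], bool],
    rollback: Callable[[str], None],
    max_failure_rate: float = 0.01,  # hypothetical threshold, not an official value
) -> RolloutResult:
    """Deploy ring by ring; stop and roll back if a ring fails too often."""
    result = RolloutResult()
    updated: List[str] = []  # every device touched so far, in order
    for name, devices in rings:
        failures = 0
        for device in devices:
            updated.append(device)
            if not apply_update(device):
                failures += 1
        if devices and failures / len(devices) > max_failure_rate:
            # Undo the update everywhere it was attempted, newest first,
            # and never touch the remaining (larger) rings.
            for device in reversed(updated):
                rollback(device)
            result.rolled_back = True
            return result
        result.completed_rings.append(name)
    return result
```

In this sketch, a faulty update is caught in a small early ring, so the broad ring of devices never receives it. That is the opposite of what happened on July 19, when all devices received the update at once.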

Read more: CrowdStrike reveals cause of global Windows blue screen problems

Short and long term

Looking at the new Microsoft blog in isolation, there’s little wrong with it. It never hurts to reiterate best practices and guidelines. Broadcom, Sophos and Trend Micro also revealed how they are interpreting secure deployment. As a collective, the Microsoft Virus Initiative (MVI) seeks to draw lessons that can benefit all vendors.

That is the main payoff in the short term. Furthermore, Microsoft wants to coordinate faster and more clearly with industry players to keep critical components secure. As a long-term effort, the tech giant points to security capabilities within Windows 11 that will eventually enable vendors to do less in kernel mode, the state in which a single software error can cause a Blue Screen of Death (BSOD).

Seeing is believing

As mentioned, no press was allowed to attend the Windows Endpoint Security Ecosystem Summit. Perhaps this gave vendors a chance to be more explicit about the exact pain points. Still, Microsoft deemed it necessary to publish the findings from the summit, and it did so in fairly short order. As currently articulated, these are rather meager. This may very well fail to do justice to the real progress made on September 10. But that, unfortunately, cannot be ascertained.

We hope more concrete solutions were discussed, such as a possible Windows equivalent of eBPF. This Linux technology lets security vendors run sandboxed programs inside the kernel (for example, to check files in system memory) without loading their own kernel modules. Because the kernel verifies such programs before running them, faulty code is far less likely to bring down the whole system.

As it stands, the messaging amounts to an implicit suggestion that Microsoft already had everything perfectly figured out, but that everyone else wasn’t listening closely enough.

In the process, Microsoft is creating its own reality in which there’s never room for a mea culpa. For example, couldn’t Microsoft have been much clearer about these best practices? After all, it has had to allow third parties to operate in kernel mode since 2009, when the EU decided third parties were disadvantaged by not having the same level of access to the OS as Microsoft. Why was passing Windows Hardware Quality Labs (WHQL) certification not required for every update, even one that didn’t alter kernel code? How could vendors ship such updates for years without Microsoft seeing the potential problems with this methodology?

This is a persistent issue: anyone looking at the company’s major security flaws soon comes across reams of explanations from Microsoft that fail to acknowledge its own shortcomings. And this is happening at a time when both Russia and China have proven able to infiltrate the tech giant’s IT environments.

Read also: Russia-backed hackers attack Microsoft: senior leadership hacked

Responsibility

CrowdStrike was quick to explain the far more explosive system failure it caused on July 19. Yet there, too, the same rhetoric has prevailed, with the company highlighting the strengths of its own performance. This may be for legal reasons, given that Delta Air Lines, for example, is pursuing a much larger claim than CrowdStrike is willing to pay out.

Obviously, organizations, as the end users, also bear responsibility. This point is also made in the recent Microsoft blog. However, it is not surprising that customers have been lulled to sleep in this area. Vendors who make lofty promises and venture deep into IT systems take on a responsibility, and create a dependency, that end users have to be able to rely on. After all, the promises of fast, comprehensive security are as impressive as the language used by Microsoft, CrowdStrike and others to uphold their reputations after incidents. A culture change on that score would be highly desirable, but it has yet to materialize.

Also read: Should CrowdStrike pay for global IT failure?