NewsBite

‘The world is in meltdown’: Inside the front lines of the CrowdStrike outage

By David Swan

It was at about 3.30pm on an otherwise quiet and uneventful Friday that Ashwin Pal’s phone began blowing up.

He was at home working when hundreds of text messages, phone calls and emails flooded in, seemingly all at once.

“‘The world is in meltdown’ is what I said to my family, and as soon as the magnitude of what happened became clear, I told them ‘there goes the weekend, I’ll see you on Monday’,” Pal said.

Outage everywhere: Screens show a blue error message at a departure floor of LaGuardia Airport in New York. Credit: AP

“It looked like a cyberattack, and that’s what everybody initially thought because everybody’s computer screens just went blue at the same time. They all started ringing me up in a panic, saying ‘oh my God, help’.”

Pal is a security veteran, working for consulting and tax giant RSM Australia in Sydney as a partner for its risk advisory division. He’s spent the last two decades working with large clients on their IT systems, security, and incident response.

RSM Australia partner Ashwin Pal was on the front lines of the response to the CrowdStrike outage.

“It doesn’t matter how much you drill this stuff,” he says. “The stress levels and the adrenaline go up.”

Pal was at the front line when about 8.5 million Windows computers, by Microsoft’s estimate, were knocked offline by last Friday’s CrowdStrike outage, grounding thousands of flights worldwide and felling banks, hospitals and train lines in what some have called the largest IT outage in history. IT administrators across the globe were forced in many cases to physically access affected machines to deploy a fix.

In Pal’s case, he rushed into the office, which he and his team effectively converted into a war room. From there, the team took a “divide and conquer” approach, working with each client to get their systems back online.

“CrowdStrike is the Ferrari of enterprise cyber detection and response, so it’s the very deep pocket clients who were impacted. This was merely a blip in South-East Asia, but developed nations like Australia and New Zealand were hit big time,” Pal said. “We were hit first while the US was still asleep.”

CrowdStrike on Thursday issued its first post-incident report into the outage. It said the incident was caused by a flaw in an update to CrowdStrike Falcon, the company’s software that sits in the background of computers to monitor for cyber threats. CrowdStrike Falcon runs at the kernel level of Windows systems, meaning it has more privileges than most other programs.

But the update was “problematic”, triggering an out-of-bounds memory read that set off Windows’ notorious “blue screen of death”, according to the post-incident report. Mac and Linux hosts were not affected. CrowdStrike has a “content validator” that reviews software updates before launch, but the program missed the update’s faulty content due to a bug.

Pal said rescuing his clients’ machines was a matter of rebooting each machine into safe mode, deleting the offending file and rebooting again.

“It required a fair bit of effort because there’s some machines you could do that remotely, but others you had to get to them physically,” he said. “There’s also a security feature that comes with Microsoft called BitLocker, which encrypts your hard drive. That had to be disabled as well before we could do anything.”

A week on from the incident, its effects linger. Some organisations could take months to recover fully, and hackers have begun using the opportunity to target CrowdStrike customers.

US Republicans who lead the House Homeland Security Committee said this week they want answers, and have called CrowdStrike’s chief executive George Kurtz to testify before Congress.

“While we appreciate CrowdStrike’s response and co-ordination with stakeholders, we cannot ignore the magnitude of this incident, which some have claimed is the largest IT outage in history,” said a letter to Kurtz from Mark E. Green of Tennessee and Andrew Garbarino of New York.

They added that Americans “deserve to know in detail how this incident happened and the mitigation steps CrowdStrike is taking.”

Australians, too, are demanding answers.

Melbourne-born Mike Sentonas was at the centre of Friday’s maelstrom. He is CrowdStrike’s global president, having climbed the corporate ladder over the past decade to become one of the highest-ranking Australian technology executives in the world. Sentonas is currently based in Las Vegas and is worth an estimated $225 million.

CrowdStrike declined to make Sentonas available despite repeated requests for interviews from this masthead and detailed written questions since the outage began last Friday. He spoke to Sky News on Wednesday, where he apologised to viewers.

“We deeply apologise. I personally apologise for what happened,” he said on air.

“We understand the disruption and the distress that we caused a lot of people. And firstly, I think it’s important to say we put out an update, which we do regularly, and we’ve been doing for over a decade. And we got this very wrong.

“We identified what the issue was very quickly. We stopped that particular file from being propagated but unfortunately a lot of people around the world did get access to that file … And the experience that people had was a blue screen of death.”

CrowdStrike sent out $US10 Uber Eats gift cards to its team members and partners who worked around the clock over the weekend. So many were being redeemed that Uber flagged the gift cards as fraud because of high usage rates.

“To express our gratitude, your next cup of coffee or late-night snack is on us!” an email signed by CrowdStrike chief business officer Daniel Bernard reads, as seen by this masthead.

Uber Eats gift cards won’t be enough for affected businesses, however. Early estimates put damages from the outage at more than $1 billion in Australia alone, raising questions about who will foot the bill.

James North is head of technology for independent law firm Corrs Chambers Westgarth, which has been in talks with affected businesses over the past week.

According to North, businesses across the country are now weighing whether they can recover financial losses caused by the outage, including the need for additional IT staff and an inability to trade.

Michael Sentonas of CrowdStrike. Credit: James Brickwood

It’s still an open question whether businesses’ cyber policy or business interruption policies would apply, he said.

What is clearer is that CrowdStrike will only be refunding its customers their subscription fees. Its standard contracts mean the company does not have to pay up for losses incurred by an outage.

“Liability for loss of revenue and other consequential losses are excluded from CrowdStrike’s standard contract,” North said. “And Australian customers also can’t access local courts when considering legal remedies, as they were required to agree to New York governing law and arbitration in Singapore in CrowdStrike’s standard contracts.

“Some customers may have better than the standard liability arrangements with CrowdStrike. For others, Australian consumer law may offer the best approach.”

Australian businesses can access statutory guarantees in certain circumstances, North said, particularly where the purchased goods or services are valued at $100,000 or less.

“In Australia, there is a guarantee that any services will be provided with due care and skill. Where an IT vendor introduced coding errors into a software update or did not properly test the update before deploying it onto its customer’s IT systems, some may consider that guarantee to have been breached,” he said.

“A business may also recover its ‘reasonably foreseeable losses’ as a result of a vendor’s ‘major failure’ to comply with a statutory guarantee. In certain circumstances, this may include trading and other financial losses.”

He said a class action lawsuit was likely to be extremely difficult because of the arbitration clauses embedded in the customer contracts.

Pal expects software companies to keep their standard contracts as they are, given that liability for outages like this one would likely bankrupt them. He said the incident underscored the importance of having adequate insurance.

Australian regulators including APRA are likely to step in once the dust has settled, according to Pal, who pointed to CPS 230. It’s a new standard coming into effect from July 2025 that focuses on business resilience, including third-party risk management and business continuity, both of which were heavily tested by the CrowdStrike outage.

“Looking into the crystal ball now over the next six to 12 months, I’m expecting quite a bit of focus on those two areas in particular, as regulators move to make sure that when an incident like this happens, malicious or otherwise, it doesn’t basically melt the world down,” he said.

“I’m going to be blunt and say this clearly outlined the absolute failure of organisations with respect to their IT disaster recovery and business continuity plans.

“This wasn’t your run-of-the-mill incident, but nevertheless this is the stuff you’ve got to prepare for. Because if you’ve prepared for something like this, then everything else is easy.”

Original URL: https://www.smh.com.au/technology/the-world-is-in-meltdown-inside-the-front-lines-of-the-crowdstrike-outage-20240724-p5jw64.html