Google Explains Meltdown, Spectre Fix Impact On Cloud Services

Google says that it wasn’t Meltdown that had the greatest impact on its cloud services but Spectre Variant 2. To fix it, the company created Retpoline, a software-only solution that regular users unfortunately can’t benefit from.

Contrary to what many had predicted, Meltdown--the vulnerability that primarily affects Intel CPUs and is fixed by isolating the kernel’s page tables, which forces the CPU to switch page tables (and potentially refill its TLB) on every transition between user and kernel mode--wasn’t the biggest headache for Google. In a blog post that goes into detail on the impact of Meltdown/Spectre on the backend of Google’s apps, Google said that, because of how long it had known about Meltdown, “extensive performance tuning work” meant that by the time it deployed the patch in October, the “protections caused no perceptible impact in [its] cloud.”

The real headache for Google turned out to be Spectre Variant 2. The hardware fix was to outright disable some forms of speculative execution in the CPU, rather than just nullify them in the situations that matter, which is what the fix for Meltdown does. The performance impact of this was significant. Google explained:

Not only did we see considerable slowdowns for many applications, we also noticed inconsistent performance, since the speed of one application could be impacted by the behavior of other applications running on the same core. Rolling out these mitigations would have negatively impacted many customers.

With nothing to lose, Google looked into “moonshot” solutions and ended up devising Retpoline, a software-only solution that avoided any hardware change and caused “almost no performance loss.” As the obvious choice, Retpoline was deployed across Google’s infrastructure and shared with others.

Spectre Variant 2 is the one for which we need BIOS fixes. If Retpoline fixes it without requiring any hardware change, then why do we need BIOS fixes? Retpoline is a software fix, but it’s a compile-time fix. That means that it’s a change implemented in the compiler that modifies the final executable it produces. It doesn’t mean that software has to be rewritten, but it does mean that it has to be recompiled.
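To make the "recompile, don't rewrite" point concrete, here is a minimal sketch in C. The handler names and the `dispatch` function are hypothetical, invented purely for illustration; they are not taken from Google's post. The call through the function-pointer table is an indirect branch, which is the kind of instruction Spectre Variant 2 abuses and the kind a retpoline-enabled compiler rewrites. The build flags in the comments are the ones we understand Clang and GCC to offer for this; we are not claiming this is exactly how Google builds its software.

```c
#include <stddef.h>

/*
 * Hypothetical example: nothing here comes from Google's code.
 *
 * Assumed build commands (the source stays identical in both cases):
 *   clang -O2 -mretpoline -c dispatch.c
 *   gcc   -O2 -mindirect-branch=thunk -c dispatch.c   (GCC 7.3 or later)
 * A plain build without these flags works too; only the generated
 * machine code for the indirect call is expected to differ.
 */

typedef int (*handler_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Table of call targets selected at run time. */
static const handler_fn handlers[] = { add_one, twice };

/*
 * The call through handlers[which] is an indirect branch: the CPU has to
 * predict its target. A retpoline build is expected to route this call
 * through a small thunk so that speculation cannot be steered to an
 * attacker-chosen address.
 */
int dispatch(size_t which, int value)
{
    size_t count = sizeof handlers / sizeof handlers[0];

    if (which >= count)
        return -1;              /* unknown handler */

    return handlers[which](value);
}
```

The source is the same with or without the mitigation; only the compiler invocation changes. That is why the approach suits an operator who can rebuild everything it runs, and why it does nothing for binaries that were never recompiled.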

For closed systems running proprietary software, which is what the Google apps’ backend is, Retpoline is the ultimate solution. It doesn’t require rewriting high-level code in individual programs. Just recompile them and it’s done. However, that’s not going to work for regular users’ systems because it’s impossible to ensure that every program that everyone runs has been re-compiled with Retpoline. As a result, to secure against Spectre Variant 2, user systems have to be patched on the hardware level. Now we don’t know specifically what the BIOS fixes that Intel, and now AMD, are pushing out do, but, from what Google says, we assume that they disable some forms of speculative execution.

The embargo period for Meltdown/Spectre, together with Retpoline, is why we didn’t see a massive impact on cloud service providers, which many predicted would be hit the hardest by the fixes. Many datacenter workloads are proprietary software, so their creators have full control over what is being run. Since we regular users don’t have this control over our software, we might end up being the hardest hit by the Meltdown/Spectre fixes.

2 comments
Comment from the forums
  • Dooger
    I'm running an Asus P8Z77 Pro motherboard with an Ivy Bridge i7-3770K CPU. After a recent update on my Windows 7 64-bit computer, it was hanging after login for 20 to 30 seconds. I believe the update (KB4056894) was meant to resolve this issue, but it was driving me nuts having to wait that long to start doing anything, so I reset to a point prior to the update and everything's okay again. How is this issue any different than the VW emissions scandal?
  • Snipergod87
    Anonymous said:
    I'm running an Asus P8Z77 Pro motherboard with an Ivy Bridge i7-3770K CPU. After a recent update on my Windows 7 64-bit computer, it was hanging after login for 20 to 30 seconds. I believe the update (KB4056894) was meant to resolve this issue, but it was driving me nuts having to wait that long to start doing anything, so I reset to a point prior to the update and everything's okay again. How is this issue any different than the VW emissions scandal?


    Quite a bit different, as the VW emissions scandal was done purposely. The bug that needs to be fixed was not put there on purpose; it's just a bug, and all software and hardware have bugs. The people jumping on the bandwagon to sue Intel over this bug are off their rockers. If we sued every hardware and software company for every bug, nobody would write code or make hardware.

    I'm not saying I'm happy with all these bugs, like Intel's AMT, Intel's Management Engine, and Meltdown, along with Spectre, which affects AMD, Intel, and ARM.