donaldihunter 6 years ago

I can't shake the feeling that Intel are trying to take credit for retpoline. They make no effort to credit the folk at Google who came up with retpoline. You have to follow the reference in section 8 to infer that Google are the saviours here.

  • hansendc 6 years ago

    Disclaimer: I put a big chunk of that paper together.

    That definitely was not the intent! Getting retpoline to the place it is today took a ton of work from a ton of people, including the awesome folks at Google like Paul Turner, and countless people in the Linux community.

  • mtgx 6 years ago

    They also did this with the ME bugs Positive Technologies found last fall. They made it sound as if their own investigation found the bugs, when they were given to them on a silver platter:

    > In response to issues identified by external researchers, Intel has performed an in-depth comprehensive security review of its Intel® Management Engine (ME), Intel® Trusted Execution Engine (TXE), and Intel® Server Platform Services (SPS) with the objective of enhancing firmware resilience.

    > As a result, Intel has identified several security vulnerabilities that could potentially place impacted platforms at risk. Systems using ME Firmware versions 6.x/7.x/8.x/9.x/10.x/11.0/11.5/11.6/11.7/11.10/11.20, SPS Firmware version 4.0, and TXE version 3.0 are impacted.

    https://security-center.intel.com/advisory.aspx?intelid=INTE...

    And pretty much all of Intel's press releases surrounding Meltdown and Spectre have been almost as misleading, too, one way or another.

  • jaydj 6 years ago

    Credit goes to Paul Turner, a software engineer in Google tech infra, who initially created the mitigation. Many CPU cycles have been saved!

    [iwork@google]

Scaevolus 6 years ago

There are already compiler patches for retpoline, but section 5.2 is alarming for Skylake and above:

"When the return stack buffer “stack” is empty on [>= Skylake] processors, a RET instruction may speculate based on the contents of the indirect branch predictor, the structure that retpoline is designed to avoid. The RSB may become empty under the following conditions:

1. Call stacks deeper than the minimum RSB depth (16) may empty the RSB when executing RET instructions. This includes CALL instructions and RET instructions within aborting TSX transactions.

[list of ~10 other situations that empty the RSB stack]"

They describe an "RSB stuffing" procedure, but I don't see any realistic way to guarantee that it happens properly with general code. How many call stacks do you have that are more than 16 frames deep? How many of those are recursive or dynamic?
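
To make the first quoted condition concrete, here is a toy Python model of a fixed-depth return stack. It is only a sketch: `ToyRSB` and `count_underflows` are hypothetical names made up for illustration, the depth of 16 is the minimum quoted above, and real hardware behavior is more complicated than a bounded stack. It shows only that a call chain deeper than the buffer leaves some RET instructions with nothing to pop.

```python
from collections import deque

RSB_DEPTH = 16  # minimum RSB depth quoted from section 5.2


class ToyRSB:
    """Toy model of a fixed-depth return stack buffer (not real hardware)."""

    def __init__(self, depth=RSB_DEPTH):
        # A bounded deque silently drops the oldest entry when it is full,
        # which stands in for the RSB overwriting its oldest return address.
        self.entries = deque(maxlen=depth)
        self.underflows = 0

    def call(self, return_addr):
        # CALL pushes a predicted return address.
        self.entries.append(return_addr)

    def ret(self):
        # RET pops a prediction if one is available.
        if self.entries:
            return self.entries.pop()
        # Empty buffer: in the quoted text, this is the case where the RET
        # may speculate using the indirect branch predictor instead.
        self.underflows += 1
        return None


def count_underflows(call_depth, rsb_depth=RSB_DEPTH):
    """Make call_depth nested calls, then unwind, and count empty-RSB RETs."""
    rsb = ToyRSB(rsb_depth)
    for addr in range(call_depth):
        rsb.call(addr)
    for _ in range(call_depth):
        rsb.ret()
    return rsb.underflows


if __name__ == "__main__":
    for depth in (8, 16, 20):
        print(depth, count_underflows(depth))
```

In this model, every frame beyond the sixteenth costs one unprotected RET on the way back out, which is why the depth of real call stacks matters here.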

  • hansendc 6 years ago

    Disclaimer: I put a big chunk of that paper together.

    You ask how it can be guaranteed with "general code". The first thing to remember is that retpoline is not for "general code". Linux, for instance, does not support arbitrary call depth and barely uses recursion.

    Also, take a close look at the "Exploit Composition" section of the paper. Those five conditions are much harder to satisfy at 'RET' than they are during the demonstrated Spectre variant 2 exploit points. For instance, a long speculation window (#5) for 'RET' is interesting to generate since it means a stall while waiting on something to come off the stack.

    • anarazel 6 years ago

      But there's plenty of userland software that has code with different privilege levels running in it. E.g. databases.

    • jesup 6 years ago

      "Linux, for instance, does not support arbitrary call depth and barely uses recursion." Perhaps there's context missing from that statement? Linux certainly can have arbitrarily deep call depth, depending on the stack size. Are you referring to the kernel? That would be odd, since the paper talks about application code needing to be fully recompiled with retpoline to be safe.

      (Of course, that means that all libraries you link with or dynamically load have to be compiled with retpoline too.)

      • hansendc 6 years ago

        Yes, I was referring to the Linux kernel: it barely uses recursion and has small stack sizes compared to normal applications.