iOS 1-day hunting: uncovering and exploiting CVE-2020-27950 kernel memory leak

Written by Fabien Perigaud - 01/12/2020 - in Exploit, Reverse-engineering
Back in the beginning of November, Project Zero announced that Apple had patched a full chain of vulnerabilities that were actively exploited in the wild. This chain consists of 3 vulnerabilities: a userland RCE in FontParser, plus a memory leak and a type confusion in the kernel.

In this blogpost, we will describe how we identified and exploited the kernel memory leak.

Introduction

On November 5th, Project Zero announced that Apple had patched, in iOS 14.2, a full chain of vulnerabilities that were actively exploited in the wild. The chain is composed of 3 vulnerabilities: a userland RCE in FontParser, plus a memory leak ("memory initialization issue") and a type confusion in the kernel.

Apple patching a full chain of vulnerabilities exploited in the wild is unusual. This kind of discovery is very interesting for several reasons:

  • if the exploit code is made public, it gives precious insights into state-of-the-art exploitation methods for the latest iOS versions, which include more and more security mitigations;
  • even if the exploit code is not available, the kernel vulnerabilities might be of great interest, since a full chain implies defeating hardened sandboxing in order to exploit the kernel from a userland application.

As Project Zero did not publish any details about the vulnerabilities or the exploitation methods, we started digging to find them ourselves.

Bindiffing made easy

Surprisingly, Apple chose to fix these vulnerabilities on older devices too, in iOS 12.4.9. This choice might be explained by Apple wanting to protect as many customers as it can, since these vulnerabilities are actively exploited in the wild.

From a security researcher's point of view, this choice is a gift. We can grab a fresh iOS 12.4.9 kernel with the vulnerabilities patched and compare it against an iOS 12.4.8 kernel: the list of changes will be minimal, as no new features are expected, and every change will likely be a vulnerability fix!

Getting kernels is not a complicated task: we can download the IPSW files corresponding to iOS versions 12.4.8 and 12.4.9 for an old iPhone model (such as the iPhone 6) using the handy website ipsw.me, which is automatically updated with links to IPSW files by parsing the public XML files hosted by Apple. IPSW files are ZIP archives containing various files, including kernelcache.release.iphone7, which is the compressed kernel binary for our iPhone model.

Depending on the iPhone model, different compression methods can be used. The targeted iPhone 6 uses LZSS, as can be seen in the compressed kernelcache header:

$ xxd -a kernelcache.release.iphone7 | head -n 10
00000000: 3083 d68f 3c16 0449 4d34 5016 046b 726e  0...<..IM4P..krn
00000010: 6c16 1e4b 6572 6e65 6c43 6163 6865 4275  l..KernelCacheBu
00000020: 696c 6465 722d 3134 3639 2e32 3630 2e31  ilder-1469.260.1
00000030: 3504 83d6 8f0b 636f 6d70 6c7a 7373 025a  5.....complzss.Z
00000040: b99c 01ae f208 00d5 cd8b 0000 0001 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
000001b0: 0000 0000 0000 ffcf faed fe0c 0000 01d5  ................
000001c0: 00f6 f002 f6f0 16f6 f058 115a f3f1 20f6  .........X.Z.. .
000001d0: f100 19f6 f028 faf0 3f5f 5f54 4558 5409  .....(..?__TEXT.

Starting at offset 0x1b6 is the compressed binary. The lzssdec tool can be used to get a clean version of the kernel binary:

$ lzssdec -o 0x1b6 < kernelcache.release.iphone7 > kernelcache.bin
$ file kernelcache.bin
kernelcache.bin: Mach-O 64-bit arm64 executable, flags:<NOUNDEFS|PIE>
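
The 0x1b6 offset does not have to be read off the hexdump by hand. In the dump above, the "complzss" signature sits at offset 0x36 and the payload starts at 0x1b6, i.e. 0x180 bytes later. The following sketch assumes that this distance is constant for LZSS-compressed kernelcaches, which is an assumption drawn from this single dump; find_lzss_payload and COMPLZSS_HEADER_SIZE are hypothetical names used for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Distance between the "complzss" signature and the compressed payload,
 * as observed in the dump above (0x1b6 - 0x36). Assumed constant here. */
#define COMPLZSS_HEADER_SIZE 0x180

/* Returns the offset of the LZSS payload inside buf, or -1 when the
 * signature is missing or the payload would start past the buffer end. */
static long find_lzss_payload(const unsigned char *buf, size_t len)
{
    static const char sig[] = "complzss";
    const size_t sig_len = sizeof(sig) - 1;

    for (size_t i = 0; i + sig_len <= len; i++) {
        if (memcmp(buf + i, sig, sig_len) == 0) {
            size_t payload = i + COMPLZSS_HEADER_SIZE;
            return payload < len ? (long)payload : -1;
        }
    }
    return -1;
}
```

The returned value can then be passed to lzssdec through its -o option.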

Now that we have the two kernel binaries, we can start diffing. We will use Bindiff 6 for IDA Pro, but other tools can also perform well.

A kernelcache consists of the kernel binary and many kernel extensions (kexts). IDA allows loading only the kernel, a single kext or the kernel with all its kexts. As we don't know yet where the vulnerabilities are located, let's load all the things!

Once IDA auto analysis has finished, we can run bindiff in the 12.4.8 IDA instance against the 12.4.9 IDB, and here are the results sorted by similarity:

Bindiff results between 12.4.8 and 12.4.9 kernels

These results are beyond all expectations! Only 8 functions differ slightly between the two versions, all of them in the kernel!

Among these 8 results, 2 are actually minor instruction reordering changes. Of the 6 remaining ones, 5 have an added call to bzero, which makes them perfect candidates for a memory leak vulnerability involving a "memory initialization issue" :)

Added bzero call

iOS kernelcaches usually lack symbols, but some entry points such as mach traps can be easily identified, e.g. using the joker tool. Debug strings along with public XNU sources also allow renaming many functions, and we could identify the 5 patched functions as:

  • mach_msg_send
  • mach_msg_overwrite
  • ipc_kmsg_get
  • ipc_kmsg_get_from_kernel
  • ipc_kobject_server

All these functions deal with ipc_kmsg objects. kmsg objects are the kernel representation of mach messages and are a complex aggregate of structures. Looking at the source code of these functions, the bzero call can be linked to the initialization of kmsg trailers.

Down the ipc_kmsg trailer rabbit hole

Trailers are structures with a dynamic size depending on their type. The tiniest trailer is an 8-byte structure containing nothing but the type and size, whereas the biggest one is 0x44 bytes long and has several fields, as seen in the following extract from XNU source code:
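
The two sizes quoted above can be checked with a simplified model of the smallest and biggest trailers. This is a sketch and not the verbatim XNU listing: field names follow the ones used in this post, while the *_model_t type names and the fixed-width stand-in types (64-bit kernel view, 4-byte packing) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the trailer layouts as seen by a 64-bit kernel.
 * The 4-byte packing and the stand-in types are assumptions of this sketch. */
#pragma pack(push, 4)
typedef struct {
    uint32_t msgh_trailer_type;
    uint32_t msgh_trailer_size;
} min_trailer_model_t;

typedef struct {
    uint32_t msgh_trailer_type;
    uint32_t msgh_trailer_size;
    uint32_t msgh_seqno;
    uint32_t msgh_sender[2];   /* security token */
    uint32_t msgh_audit[8];    /* audit token */
    uint64_t msgh_context;
    int32_t  msgh_ad;
    uint32_t msgh_labels;      /* labels.sender */
} max_trailer_model_t;
#pragma pack(pop)

/* The sizes quoted in the text: 8 bytes and 0x44 bytes. */
static_assert(sizeof(min_trailer_model_t) == 8, "smallest trailer");
static_assert(sizeof(max_trailer_model_t) == 0x44, "biggest trailer");
```

In this model msgh_ad lands at offset 60, so it is only part of the copied-out data when the returned trailer size is at least 64 bytes.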

When creating a new kmsg, the kernel does not know yet which trailer type will be requested when receiving the message. It thus reserves the biggest size, initializes some fields, and sets the type to the smallest one. For example, the trailer initialization in ipc_kmsg_get is:
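
To make the effect of this partial initialization concrete, here is a userland simulation rather than the kernel code itself. It fills a trailer-sized buffer with a marker byte standing for stale heap data, applies the initialization described in this post (sender, audit, labels, type and size), and counts the bytes that still hold stale data. The struct is a simplified model, the zeroed tokens are placeholders for the task tokens, and the assumption that the added bzero covers the whole trailer is ours.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 4)
typedef struct {
    uint32_t msgh_trailer_type;
    uint32_t msgh_trailer_size;
    uint32_t msgh_seqno;
    uint32_t msgh_sender[2];
    uint32_t msgh_audit[8];
    uint64_t msgh_context;
    int32_t  msgh_ad;
    uint32_t msgh_labels;
} max_trailer_model_t;
#pragma pack(pop)

#define STALE 0x41  /* marker byte standing for leftover heap content */

/* Initialization as described in the text: only some fields are written. */
static void init_trailer_prepatch(max_trailer_model_t *t)
{
    memset(t->msgh_sender, 0, sizeof t->msgh_sender); /* placeholder token */
    memset(t->msgh_audit, 0, sizeof t->msgh_audit);   /* placeholder token */
    t->msgh_trailer_type = 0;  /* smallest trailer format */
    t->msgh_trailer_size = 8;  /* minimum trailer size */
    t->msgh_labels = 0;
}

/* Assumed shape of the fix: clear everything, then initialize as before. */
static void init_trailer_patched(max_trailer_model_t *t)
{
    memset(t, 0, sizeof *t);
    init_trailer_prepatch(t);
}

/* Counts the bytes that still carry the stale marker. */
static size_t count_stale_bytes(const max_trailer_model_t *t)
{
    const unsigned char *p = (const unsigned char *)t;
    size_t n = 0;
    for (size_t i = 0; i < sizeof *t; i++)
        n += (p[i] == STALE);
    return n;
}
```

With the pre-patch initialization, 16 bytes stay stale in this model: msgh_seqno (4), msgh_context (8) and msgh_ad (4).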

This looks interesting! If we're able to read a mach message asking for a longer trailer than expected, we might retrieve uninitialized chunks of memory.

When reading a mach message using mach_msg(), the execution flow in kernel-land to reach the trailer copyout is:

  • mach_msg_trap
    • mach_msg_overwrite_trap
      • mach_msg_receive_results
        • ipc_kmsg_add_trailer

In ipc_kmsg_add_trailer(), the output trailer size is calculated:

  • In [1], a new trailer is declared on the stack.
  • In [2], the kmsg trailer content is copied into the new trailer.
  • In [3], the option argument is checked against MACH_RCV_TRAILER_MASK. This option parameter comes from the option parameter passed to mach_msg() in userland.
  • In [4], the real trailer size is calculated using the REQUESTED_TRAILER_SIZE() macro.

By providing an option matching MACH_RCV_TRAILER_MASK to mach_msg(), we can ask the kernel to return a specific trailer size. The supported options are defined in message.h:
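
As a reference for the option encoding, the definitions below are reproduced from memory of the public message.h header and should be checked against the XNU release being studied. The rcv_option_for() helper is hypothetical and only shows how the requested trailer element ends up in bits 24 to 27 of the option word.

```c
#include <stdint.h>

/* Values as recalled from the public header; verify before relying on them. */
#define MACH_RCV_MSG              0x00000002u

#define MACH_RCV_TRAILER_NULL     0u
#define MACH_RCV_TRAILER_SEQNO    1u
#define MACH_RCV_TRAILER_SENDER   2u
#define MACH_RCV_TRAILER_AUDIT    3u
#define MACH_RCV_TRAILER_CTX      4u
#define MACH_RCV_TRAILER_AV       7u
#define MACH_RCV_TRAILER_LABELS   8u

#define MACH_RCV_TRAILER_ELEMENTS(x) (((x) & 0xfu) << 24)
#define MACH_RCV_TRAILER_MASK        (0xfu << 24)
#define GET_RCV_ELEMENTS(y)          (((y) >> 24) & 0xfu)

/* Hypothetical helper: builds a receive option requesting a trailer element. */
static uint32_t rcv_option_for(uint32_t elements)
{
    return MACH_RCV_MSG | MACH_RCV_TRAILER_ELEMENTS(elements);
}
```

Note that nothing in this encoding prevents a caller from passing 5 or 6, even though no MACH_RCV_TRAILER_XXX constant has those values.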

Thus, we can call mach_msg() with e.g. MACH_RCV_TRAILER_ELEMENTS(MACH_RCV_TRAILER_AUDIT) in the option parameter to request a specific trailer size. Now, what happens in ipc_kmsg_add_trailer() when requesting a trailer bigger than the initialized one? In ipc_kmsg_get(), we saw that only msgh_sender, msgh_audit and msgh_labels optional fields were initialized, leaving 3 fields uninitialized.

  • In [1], msgh_seqno and msgh_context are initialized in the trailer copy.
  • In [2], a boolean passed to the function is checked to return early. This boolean is false when called from mach_msg_receive_results().
  • In [3], the function checks whether the option passed is greater than or equal to MACH_RCV_TRAILER_AV, meaning that we want to retrieve a structure containing at least msgh_ad. If this is the case, msgh_ad is initialized to 0 in the trailer copy.
  • In [4], finally, ipc_kmsg_munge_trailer() copies back the msgh_seqno, msgh_context, msgh_trailer_size and msgh_ad from the trailer copy to the original trailer.
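
The four steps above can be sketched as a small userland model. It is not the XNU function: add_trailer_model() and its parameters are hypothetical, the size computation is left out, and the placement of the [1] to [4] markers is our reading of the bullets.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TRAILER_AV 7u  /* value of MACH_RCV_TRAILER_AV */

#pragma pack(push, 4)
typedef struct {
    uint32_t msgh_trailer_type;
    uint32_t msgh_trailer_size;
    uint32_t msgh_seqno;
    uint32_t msgh_sender[2];
    uint32_t msgh_audit[8];
    uint64_t msgh_context;
    int32_t  msgh_ad;
    uint32_t msgh_labels;
} max_trailer_model_t;
#pragma pack(pop)

/* Model of the receive-side trailer fix-up described in the bullets. */
static void add_trailer_model(max_trailer_model_t *kmsg_trailer,
                              unsigned elements, bool minimal,
                              uint32_t seqno, uint64_t context)
{
    max_trailer_model_t tmp;
    memcpy(&tmp, kmsg_trailer, sizeof tmp);   /* work on a stack copy */

    tmp.msgh_seqno = seqno;                   /* [1] */
    tmp.msgh_context = context;

    if (!minimal) {                           /* [2] skip optional fields */
        if (elements >= TRAILER_AV)           /* [3] */
            tmp.msgh_ad = 0;
    }

    /* [4] stand-in for ipc_kmsg_munge_trailer(): copy selected fields back */
    kmsg_trailer->msgh_seqno = tmp.msgh_seqno;
    kmsg_trailer->msgh_context = tmp.msgh_context;
    kmsg_trailer->msgh_trailer_size = tmp.msgh_trailer_size;
    kmsg_trailer->msgh_ad = tmp.msgh_ad;
}
```

In this model, msgh_ad is only cleared when the requested element value is at least MACH_RCV_TRAILER_AV; for smaller values it keeps whatever the kmsg trailer contained.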

A high-level look does not reveal any bug here: all the fields seem to have been correctly initialized before being returned to userland. However, let's have a look at how the trailer size is really computed by the REQUESTED_TRAILER_SIZE() macro:

This macro returns the correct size when the option value is known, and the maximum size when it is not. This means that by setting a non-existent option lower than MACH_RCV_TRAILER_AV, we can skip the msgh_ad field initialization, while still recovering the biggest possible trailer. This bug is made possible by the fact that values 5 and 6 are not valid MACH_RCV_TRAILER_XXX definitions!
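
The mismatch between the size selection and the msgh_ad check can be modeled in a few lines. This is a sketch of the behavior described above, not the macro itself: the sizes are those of the 64-bit kernel trailer structures as we recall them, and requested_trailer_size_model() and leaks_msgh_ad() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define TRAILER_AV      7u   /* value of MACH_RCV_TRAILER_AV */
#define MSGH_AD_OFFSET  60u  /* offset of msgh_ad in the biggest trailer */
#define MSGH_AD_SIZE    4u

/* Known element values map to their trailer size, anything else to the max. */
static size_t requested_trailer_size_model(unsigned elements)
{
    switch (elements) {
    case 0:  return 8;   /* type + size only */
    case 1:  return 12;  /* + msgh_seqno */
    case 2:  return 20;  /* + msgh_sender */
    case 3:  return 52;  /* + msgh_audit */
    case 4:  return 60;  /* + msgh_context */
    case 7:  return 68;  /* + msgh_ad and labels */
    default: return 68;  /* unknown value: biggest trailer */
    }
}

/* True when msgh_ad is copied out without having been zeroed first. */
static bool leaks_msgh_ad(unsigned elements)
{
    bool covers_ad = requested_trailer_size_model(elements)
                     >= MSGH_AD_OFFSET + MSGH_AD_SIZE;
    bool zeroed = elements >= TRAILER_AV;
    return covers_ad && !zeroed;
}
```

Iterating over the 16 possible element values with this model flags exactly 5 and 6, the two undefined values below MACH_RCV_TRAILER_AV.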

To illustrate this behavior, we can write a simple proof of concept reading a known value from uninitialized memory. In iOS up to 13.x, pipe buffers and ipc_kmsg can be allocated in the same kalloc area, as there are no separate heaps before iOS 14. Thus, we can create a pipe buffer filled with a known value (in e.g. the kalloc.1024 zone), free it, then send a mach message whose size causes it to be allocated in kalloc.1024 as well, and finally trigger the vulnerability to read back the known value. Here is the code (github link):

The PoC produces the following output, effectively leaking our magic controlled value:

What about leaking a nice kernel pointer?

Leaking a known value proves that the vulnerability exists. However, using it to reliably leak an interesting value is usually harder.

An interesting feature of mach messages is their ability to transport mach port rights. When sending a port right, the mach_msg_port_descriptor_t structure is used:

This structure differs between userland and the kernel. Indeed, in userland mach_port_t is defined as an unsigned int (an opaque value identifying a port), whereas in the kernel it is defined as a pointer to a struct ipc_port.
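
The two views can be compared with a pair of model structures. This is a sketch: the userland layout follows the public header as we recall it, while the kernel-side padding (no pad1, a trailing pad_end) is an assumption to be checked against the XNU sources, and the pointer is modeled as a uint64_t so that the sizes do not depend on the build host.

```c
#include <assert.h>
#include <stdint.h>

#pragma pack(push, 4)
/* Userland view: the port is a 32-bit name. */
typedef struct {
    uint32_t name;             /* mach_port_t as an opaque unsigned int */
    uint32_t pad1;
    uint32_t pad2 : 16;
    uint32_t disposition : 8;
    uint32_t type : 8;
} user_port_descriptor_model_t;

/* 64-bit kernel view: the same field holds a struct ipc_port pointer. */
typedef struct {
    uint64_t name;             /* models struct ipc_port * on arm64 */
    uint32_t pad2 : 16;
    uint32_t disposition : 8;
    uint32_t type : 8;
    uint32_t pad_end;          /* assumed kernel-only padding */
} kernel_port_descriptor_model_t;
#pragma pack(pop)

static_assert(sizeof(user_port_descriptor_model_t) == 12, "userland size");
static_assert(sizeof(kernel_port_descriptor_model_t) == 16, "kernel size");
```

Under these assumptions each descriptor grows by 4 bytes when copied into the kernel, and every one of them stores a full kernel pointer in its first 8 bytes.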

This difference means that a mach message sent with multiple mach_msg_port_descriptor_t structures will result in a kernel ipc_kmsg structure containing multiple pointers to ports. Thus, we're able to put interesting data in a kernel buffer we might be able to leak later!

The trick to be able to read part of ipc_port pointers is to send a first message containing X mach_msg_port_descriptor_t, free it, and send another message with X-Y mach_msg_port_descriptor_t, so the allocation is reused and its trailer is written where the previous message descriptors were lying. The number of descriptors sent has to be adjusted to fulfill 2 conditions:

  • the ipc_kmsg allocations should be made in the same kalloc zone;
  • the difference between X and X-Y descriptors should be sufficient to shift the trailer earlier in the buffer so that it overlaps some previous message descriptors.

In practice, sending 50 descriptors in the first message and 40 descriptors in the second one fulfills the conditions. As the vulnerability only allows leaking 4 bytes of memory, we also need to shift the trailer by steps of 4 bytes. Luckily, we're able to send some padding in a mach message without triggering any problem (as long as we pad with multiples of 4 bytes), allowing us to efficiently shift our leak window.
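
Since each trigger returns 4 bytes, a 64-bit pointer has to be rebuilt from two leaks taken 4 bytes apart. The helpers below are a sketch with hypothetical names; they assume a little-endian target, where the lower-addressed half is the low 32 bits, and use a rough heuristic (top 24 bits set) to recognize a kernel address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rebuilds a 64-bit value from two adjacent 4-byte leaks (little-endian:
 * the leak taken at the lower address is the low half). */
static uint64_t combine_leaks(uint32_t low_half, uint32_t high_half)
{
    return ((uint64_t)high_half << 32) | low_half;
}

/* Heuristic only (assumption): arm64 kernel addresses start with 0xffffff. */
static bool looks_like_kernel_pointer(uint64_t value)
{
    return (value >> 40) == 0xffffffu;
}
```

A candidate pair that fails the heuristic most likely means that the leak window is not aligned on a descriptor pointer yet, and the padding should be adjusted by another 4 bytes.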

We still have one step to complete: being able to free the ipc_kmsg buffer containing the kernel pointer. If we try to read the message normally, the pointers will be replaced by the userland mach name before being copied back to userland. We thus have to trigger an error to simply free the allocation without triggering this behavior.

Here is the final exploit leaking a kernel ipc_port address (github link):

The exploit should produce the following output when executed on iOS prior to 14.2:

Conclusion

In this blogpost, we investigated a patched iOS kernel to retrieve details about a patched kernel memory leak vulnerability. We identified the root cause, wrote a simple PoC and found a method to reliably get a mach port kernel address. It's quite surprising how long this vulnerability has survived in XNU knowing that the code is open source and heavily audited by hundreds of hackers.

The attentive reader will have noticed that we didn't detail the other patched vulnerability, identified as a type confusion by Apple. While the fix is quite easy to find with bindiff, its analysis is not so trivial, and might be the subject of a future blogpost if we get enough time to dig into it!