BFS 2019 Exploitation Challenge

On September 7th, 2019, BFS published an exploitation challenge on Windows 10 x64 to win an entry for the BFS-IOACTIVE party during the Ekoparty conference. This blogpost aims at describing a successful resolution of the challenge.

Context

The challenge is a single Windows 64-bits binary spawning a service listening on port 54321.

The rules consist in:

  • Exploiting the binary to pop a "calc.exe" or "notepad.exe", by defeating ASLR remotely;
  • Ensuring the service continues to work flawlessly.

To be accepted, the exploitation code should be written in Python and work against the latest Windows 10 64-bits version (19H1, aka 1903).

Reverse engineering the service

The binary is quite small, and only consists in a few functions.

In the main function, the following actions are performed:

  • If the number of arguments passed to the binary is 0, WinExec is called with argv[0] as argument. This condition should never be true in normal cases, so this call has been put only to help resolving the challenge;
  • A table (later called init_table) of 256 QWORD is initialized with value 0x488B01C3C3C3C3 ORed with the table index as the most-significant byte;
  • Finally, a socket is bound to port 54321 and a connection handler is called for each new connected client.

The table is quite interesting, since value C3 is the encoding for the RET instruction, and 48 8B 01 encodes a MOV RAX, [RCX]. Let's keep this information for later.

On the connection handler side:

  • Two stack variables are initialized with an address in the .data (later called data_pointer) section and value 0x3e (later called offset);
  • A 0x10 bytes header is then read from the socket. This header has the following structure:
struct header {
    dq magic;
    dd pktSize;
    dd unknown;
};
  • header.magic is checked to match Eko2019\0;
  • header.pktSize is checked to be lower than 512;
  • header.pktSize bytes are then read from the socket to a 512-bytes stack buffer.

When looking at what happen to the received data, we immediately know we face a challenge, the behavior being very unusual:

  • The received buffer content is displayed on stdout using printf(" [+] Remote message (%i): '%s'\n");
  • The printf return value is stored in data_pointer;
  • Entry at index offset is retrieved from init_table, swapped, then written over a place-holder function using WriteProcessMemory;
  • The placeholder function is called with the stack variable containing data_pointer as argument, and its 8-bytes return value is sent back in the socket.

As the default offset used in the function is 0x3E, the placeholder function is overwritten with bytes 3E 48 8B 01 C3 C3 C3 C3. 3E is a well known x86 segment override prefix: it makes the memory access instruction use the DS segment. The placeholder function code is thus:

0x0:    mov     rax, qword ptr ds:[rcx]
0x4:    ret     
0x5:    ret     
0x6:    ret     
0x7:    ret 

Looking for a vulnerability

When looking at how the size is checked and used, one can notice a bad behavior: size_check.png

  • size is checked to be lower than 512 using the jle instruction, which implies a signed comparison;
  • size is then casted as an unsigned word using the movzx instruction.

This is clearly a vulnerability, as we can forge a size field which will pass the comparison (providing a negative 32-bits integer) and will be greater than 512 after the cast. For example, using value 0xffff0400 will pass the check because it represents the signed number -64512, but size 0x400 will then be used for the recv() call.

Exploitation

Defeating ASLR

Just after the fixed-size buffer are the two aforementioned stack variables: offset and data_pointer, which mean we can change both the code executed in the placeholder function and its argument! If we stick to the used offset (3E), changing only the data_pointer provides us with an arbitrary read, which is a very strong primitive for a remote exploit. However, we still have to find a suitable memory address to read!

Considering we can chose an arbitrary x86 prefix for the executed mov instruction, we can change offset to use the GS override prefix, changing the called function to:

0x0:    mov     rax, qword ptr gs:[rcx]
0x4:    ret     
0x5:    ret     
0x6:    ret     
0x7:    ret 

On 64-bits Windows, GS segment points to the Thread Information Block, a data structure containing various information about the currently running thread, with interesting values from an exploiter point of view:

  • The stack base value at offset 0x8, giving us a pointer to the stack;
  • The PEB address at offset 0x60, from which we can read the process ImageBaseAddress at offset 0x10.

From now, we have successfully defeated ASLR as we are able to read the whole binary memory space as well as the current thread stack.

Code execution

Instead of changing the byte before the move instruction to a x86 prefix, we can change it to a 1-byte instruction to try to change the execution flow. Considering that we control the function first parameter (passed in rcx register), we can use the offset value 51, changing the function to:

0x0:    push    rcx
0x1:    mov rax, qword ptr [rcx]
0x4:    ret 
0x5:    ret 
0x6:    ret 
0x7:    ret

We thus can redirect execution to the address contained in rcx. We just need to find a suitable gadget to make rsp point to our stack-buffer, which contains data we control. Such a gadget will allow us to execute a ROP-chain.

Stack pivot and ROP chain

When we call the placeholder function, the stack-buffer we control is 0x60 bytes above. We can thus find a simple gadget adjusting rsp to fall in our buffer. Such a gadget can be found at offset 0x1aea in the binary:

.text:0000000140001AEA                 add     rsp, 88h
.text:0000000140001AF1                 retn

For the ROP-chain, all we need is returning to main with controlled arguments on the stack to be able to call WinExec.

The first problem we face is that we need a gadget to set rcx and rdx, which are arguments argc and argv for function main. This can easily be circumvented by calling main+0x9, since the two first instructions copy these registers on the stack:

.text:0000000140001410 main            proc near
.text:0000000140001410                 mov     [rsp+arg_8], rdx
.text:0000000140001415                 mov     [rsp+arg_0], ecx

We can now provide main arguments by passing them on the stack.

Another thing we need to care about is the stack alignment. On Windows, it is assumed that the stack pointer is aligned on 0x10 after a function prologue. If the stack is not aligned, some compiler optimizations such as the use of XMM registers will trigger an access violation.

Finally, we need to craft the fake argv to have argv[0] pointing to a string we control. This can be done by scanning the stack to retrieve our buffer ("egg hunting" to find a magic value), and then crafting the fake array in the buffer. All the pieces (ROP-chain, fake argv, command to execute) can be put in the stack buffer.

Service continuation

After the WinExec execution, the main function will continue to execute until it tries to bind the socket a second time and fail. It will then return, using the next entry in our ROP-chain.

Now, all we have to do is chaining gadgets incrementing the stack pointer to make it point to the return address of the connection handler. The process will then return to main, just as if the connection handler terminated successfully, and will accept new connections from clients.

Final exploit

The final exploitation code is available on our GitHub, feel free to drop us an e-mail if you have a different exploitation technique!