Treasure Chest Party Quest: From DOOM to exploit

Written by Tristan Pourcelot, Rémi Jullian - 25/11/2020 - in Hardware, Exploit, Systems

In this blogpost, we look at what happens when two security researchers come across a random printer and manage to find vulnerabilities in it.


Context and objectives

From an attacker's point of view, gaining code execution on a printer connected to the LAN can be interesting for several reasons:

It can provide a long-term persistence mechanism, as printers are less likely to be re-installed than workstations
It can be used to perform lateral movement within the internal network
It can give access to sensitive documents that may be scanned and printed, but never stored on a workstation

Security researchers from Contextis managed to run the famous FPS video game Doom on a Canon MG6450 printer, as shown in [1]. They exploited weaknesses in the algorithm used to encrypt newer firmware versions in order to craft and deploy a custom firmware. Based on their work, we managed to obtain the firmware used by Canon printers from the MX920 series, such as the Pixma MX925.

Thus, we spent a week, not trying to run a video game on a printer, but rather trying to find vulnerabilities that may be triggered from a compromised workstation, sharing the same LAN as the targeted printer.

In this blogpost, we’ll explain the firmware update mechanism, highlight the operating system behind Canon firmware, and talk about the attack surface and vulnerabilities that we found.

Firmware analysis

Firmware download over HTTP

MX920 series firmware can be updated manually, through the dedicated section on the web-interface, used to configure and manage the printer. The following URL is hardcoded in the firmware, and is used to download an XML file containing update information in order to obtain the latest firmware for a specific model:

http://gdlp01.c-wss.com/rmds/ij/ijd/ijdupdate/176b.xml

curl http://gdlp01.c-wss.com/rmds/ij/ijd/ijdupdate/176b.xml
<?xml version="1.0" encoding="UTF-8" ?>
<update_info>
<version>3.020</version>
<url>http://pdisp01.c-wss.com/gdl/WWUFORedirectTarget.do?id=MDQwMDAwNDgwNjAx&cmp=Z01&lang=EN</url>
<size>37127366</size>
</update_info>

The ID used in the URL, 176b, looks like the USB Product ID, and is used by Canon to reference a unique model. As shown on https://devicehunt.com/view/type/usb/vendor/04A9, it is related to PIXMA MX920 Series.

If the version specified in the XML file is newer than the current one, a second URL (extracted from the XML) is used to obtain an HTTP redirection leading to the final URL in order to get the new firmware.

curl "http://pdisp01.c-wss.com/gdl/WWUFORedirectTarget.do?id=MDQwMDAwNDgwNjAx&cmp=Z01&lang=EN"
<html><head><title>302 Moved Temporarily</title></head>
<body bgcolor="#FFFFFF">
<p>This document you requested has moved temporarily.</p>
<p>It's now at <a href="http://gdlp01.c-wss.com/gds/6/0400004806/01/176BV3020AN.bin">http://gdlp01.c-wss.com/gds/6/0400004806/01/176BV3020AN.bin</a>.</p>
</body></html>
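The update check described above can be modelled in a few lines of Python. This is a minimal sketch, not the firmware's logic: the function names are hypothetical, the numeric version comparison is an assumption (the printer's actual comparison was not documented here), and a regular expression is used instead of an XML parser because the listing as shown contains an unescaped "&" that a strict parser would reject.

```python
import re

# Sample taken from the curl output shown above.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8" ?>
<update_info>
<version>3.020</version>
<url>http://pdisp01.c-wss.com/gdl/WWUFORedirectTarget.do?id=MDQwMDAwNDgwNjAx&cmp=Z01&lang=EN</url>
<size>37127366</size>
</update_info>"""


def parse_update_info(xml_text: str) -> dict:
    """Extract the version, url and size fields from the update descriptor."""
    fields = {}
    for tag in ("version", "url", "size"):
        match = re.search(rf"<{tag}>(.*?)</{tag}>", xml_text, re.S)
        if match is None:
            raise ValueError(f"missing <{tag}> element")
        fields[tag] = match.group(1).strip()
    fields["size"] = int(fields["size"])
    return fields


def version_key(version: str) -> tuple:
    """Turn '3.020' into (3, 20). Assumption: versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_newer(advertised: str, installed: str) -> bool:
    """True when the advertised version is strictly newer than the installed one."""
    return version_key(advertised) > version_key(installed)


def redirect_target(html_body: str) -> str:
    """Pull the final firmware URL out of the 302 response body."""
    match = re.search(r'href="([^"]+)"', html_body)
    if match is None:
        raise ValueError("no link found in redirect body")
    return match.group(1)
```

Calling parse_update_info(SAMPLE_XML) yields the advertised version, the redirect URL and the expected firmware size, which is enough to decide whether a download should be attempted.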

It’s interesting to note that the firmware doesn’t embed an HTTP client like curl or wget, but rather implements a custom one using low-level sockets (User-Agent "IP Client/1.0.0.0").

Decrypting the firmware

As documented in Contextis's research [1], the firmware update file is encrypted using a custom XOR-based scheme.

Reimplementing Contextis's cleartext attack was just a matter of writing a script and analyzing the XOR patterns.

The script used for this attack is available in our Github repository at https://github.com/synacktiv/canon-tools
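To illustrate the general idea behind a cleartext attack on an XOR cipher, here is a deliberately simplified sketch. It assumes a plain repeating-key XOR and a region whose cleartext is known (for instance padding assumed to be zero-filled); Canon's actual scheme is not described in this post and differs from this toy model, so refer to the repository above for the real tool. All function names are hypothetical.

```python
def xor_repeating(data: bytes, key: bytes) -> bytes:
    """XOR data with a key repeated as many times as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def recover_keystream(ciphertext: bytes, known_plaintext: bytes, offset: int = 0) -> bytes:
    """XOR the ciphertext with the known cleartext to expose the keystream."""
    window = ciphertext[offset:offset + len(known_plaintext)]
    if len(window) != len(known_plaintext):
        raise ValueError("known plaintext runs past the end of the ciphertext")
    return bytes(c ^ p for c, p in zip(window, known_plaintext))


def smallest_period(stream: bytes) -> int:
    """Return the shortest period p such that stream[i] == stream[i % p] for all i."""
    if not stream:
        raise ValueError("stream must not be empty")
    for period in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % period] for i in range(len(stream))):
            return period
    return len(stream)  # not reached: the full length is always a valid period
```

Once the period is known, the first period bytes of the recovered keystream can be fed back into xor_repeating to decrypt the rest of the file, provided the known region starts on a key boundary.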

We are aware that newer printers released by Canon are fitted with firmware on which this attack doesn’t work anymore.

Decompressing the main firmware

The decrypted firmware is a bootloader used to decompress and run the main firmware (ARM code). The first step is thus to take a look at the bootloader in order to find the decompression routine.

After identifying a few functions manually, IDA's disassembly engine starts to pay off and a few more functions are automatically discovered. This one grabbed our attention:

_BYTE *__fastcall small_decompress_routine(_BYTE *dictionnary, _BYTE *dest, int uncompressed_length)
{
  _BYTE *end; // r2
  int first_byte; // r3
  int same_data_count; // r4
  int chunk_size; // r5
  int i; // r4
  char tmp_same_byte; // r6
  int v9; // r4
  unsigned int off_; // r3
  _BYTE *src_start; // r4
  char *src; // r4
  int chunk_size_; // r3
  char byte; // r6

  end = &dest[uncompressed_length];
  do
  {
    first_byte = (unsigned __int8)*dictionnary++;
    same_data_count = first_byte & 3;
    if ( (first_byte & 3) == 0 )
      same_data_count = (unsigned __int8)*dictionnary++;
    chunk_size = first_byte >> 4;
    if ( !(first_byte >> 4) )
      chunk_size = (unsigned __int8)*dictionnary++;
    for ( i = same_data_count - 1; i; --i )
    {
      tmp_same_byte = *dictionnary++;
      *dest++ = tmp_same_byte;
    }
    if ( chunk_size )
    {
      v9 = (unsigned __int8)*dictionnary++;
      off_ = (unsigned int)(first_byte << 28) >> 30;
      src_start = &dest[-v9];
      if ( off_ == 3 )
        off_ = (unsigned __int8)*dictionnary++;
      src = &src_start[-256 * off_];
      chunk_size_ = chunk_size + 1;
      do
      {
        byte = *src++;
        *dest++ = byte;
        --chunk_size_;
      }
      while ( chunk_size_ >= 0 );
    }
  }
  while ( dest < end );
  return dictionnary;
}

It implements a small decompression routine, based on a dictionary, similar to the LZ algorithm.
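As an alternative to emulation, the routine can be transcribed to Python. The following is a sketch derived only from the decompiled listing above, so it inherits any decompiler inaccuracy; the function name and the error handling for malformed streams are our own additions. Each token starts with a control byte: its two low bits give the literal count plus one, bits 2-3 give the high part of the back-reference distance, and the high nibble gives the match length, with escape bytes when a field is 0 (or 3 for the distance).

```python
def small_decompress(stream: bytes, uncompressed_length: int) -> bytes:
    """Python model of small_decompress_routine (literals plus back-references)."""
    out = bytearray()
    pos = 0
    while True:
        first = stream[pos]
        pos += 1
        # Literal field: low 2 bits, or a full extra byte when they are 0.
        literal_field = first & 3
        if literal_field == 0:
            literal_field = stream[pos]
            pos += 1
        # Match field: high nibble, or a full extra byte when it is 0.
        match_field = first >> 4
        if match_field == 0:
            match_field = stream[pos]
            pos += 1
        if literal_field == 0:
            # The C loop would wrap around here; treat it as a malformed stream.
            raise ValueError("literal field of 0 is not supported by this model")
        literal_count = literal_field - 1
        out += stream[pos:pos + literal_count]
        pos += literal_count
        if match_field:
            distance_low = stream[pos]
            pos += 1
            distance_high = (first >> 2) & 3  # same as (first << 28) >> 30 on 32 bits
            if distance_high == 3:
                distance_high = stream[pos]
                pos += 1
            distance = distance_low + 256 * distance_high
            if distance == 0 or distance > len(out):
                raise ValueError("back-reference outside of the output produced so far")
            # The do/while loop in the listing copies match_field + 2 bytes,
            # one at a time, so overlapping copies behave as in LZ77.
            for _ in range(match_field + 2):
                out.append(out[-distance])
        if len(out) >= uncompressed_length:
            return bytes(out)
```

For example, the four-byte stream 23 41 42 02 encodes two literals ("AB") followed by a four-byte copy from distance 2.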

By looking at the calling function (X-Ref), we can identify that:

The dictionary is located at 0x043ff000
The uncompressed firmware will be stored at 0x1DF9DE00
The uncompressed firmware size is 0x108A780

As we identified the dictionary at offset 0x3FF000, relative from the start of the ROM, we can deduce that the base address of the bootloader is 0x04000000.
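The arithmetic behind this deduction, and behind the page-aligned mapping required by an emulator such as Unicorn (which only maps memory on 4 KB boundaries), can be checked with a short sketch. The helper names are hypothetical; the size is the 0x108A780 value that appears in the emulation script.

```python
PAGE = 0x1000  # Unicorn requires mapped addresses and sizes to be 4 KB aligned

DICT_VIRTUAL_ADDR = 0x043FF000  # dictionary address seen in the calling function
DICT_FILE_OFFSET = 0x3FF000     # dictionary offset from the start of the ROM
DEST_ADDR = 0x1DF9DE00          # destination of the uncompressed firmware
DEST_SIZE = 0x108A780           # size used by the emulation script


def page_floor(addr: int) -> int:
    """Round an address down to the start of its page."""
    return addr & ~(PAGE - 1)


def page_ceil(addr: int) -> int:
    """Round an address up to the next page boundary."""
    return (addr + PAGE - 1) & ~(PAGE - 1)


# Base address = where the dictionary is mapped minus where it sits in the file.
base_address = DICT_VIRTUAL_ADDR - DICT_FILE_OFFSET

# Smallest page-aligned region that covers the whole destination buffer.
map_start = page_floor(DEST_ADDR)
map_size = page_ceil(DEST_ADDR + DEST_SIZE) - map_start

print(hex(base_address), hex(map_start), hex(map_size))
```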

A Unicorn [2] script was written to emulate the decompression routine and get the uncompressed firmware:

#!/usr/bin/env python3

from unicorn import *
from unicorn.arm_const import *

def hook_code(mu, address, size, user_data):  
    if address == 0x04220384:
        R1 = mu.reg_read(UC_ARM_REG_R1)
        R2 = mu.reg_read(UC_ARM_REG_R2)
        print('Uncompressed bytes %x / %x' % (R1, R2))

BASE = 0x04000000
STACK_ADDR = 0xFFFFFFFF
STACK_SIZE = 2 * 1024 * 1024 # 2 MB stack size
FW_PATH = 'firmware/176BV3020AN_decrypted-fixed.bin'

mu = Uc(UC_ARCH_ARM, UC_MODE_ARM|UC_MODE_THUMB)

with open(FW_PATH, 'rb') as f:
    fw_data = f.read()

# Map stack
mu.mem_map(STACK_ADDR + 1 - STACK_SIZE, STACK_SIZE)

# Map firmware at 0x04000000
mu.mem_map(BASE, 16*1024*1024) # 16MB
mu.mem_write(BASE, fw_data)

# 0x1DF9DE00: address of decompression buffer of size 0x108A780
mu.mem_map(0x1DF9DE00 & (~(0x1000-1)) , (0x108A780 & (~(0x1000-1))) + 0x2000)

mu.hook_add(UC_HOOK_CODE, hook_code)

mu.reg_write(UC_ARM_REG_SP, STACK_ADDR & (~(0x1000-1)))

decompression_routine = 0x04220998+1
decompression_routine_end = 0x042209ae
mu.emu_start(decompression_routine, decompression_routine_end)

with open('firmware/176BV3020AN_decrypted-uncompressed.bin', 'wb') as f:
    memory = mu.mem_read(0x1DF9DE00, 0x108A780)
    f.write(memory)

Problems encountered during analysis

As the decrypted firmware is a raw binary file, which IDA cannot parse the way it parses an ELF or a PE file, IDA can’t easily distinguish data from code. Also, the entry point, base address and memory map of the firmware were unknown. While these problems are common when analyzing firmware, they were sometimes a hindrance to our analysis.

At the end of our reverse-engineering week, we identified at least 58k functions in the MX920 series firmware. With a bit of scripting, we were able to automatically rename a few functions which used debugging primitives.
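A possible first step for this kind of renaming is to locate the source file names that debugging calls pass as arguments (strings such as "bjnp_tcp.c"). The sketch below only does that string hunting on a raw firmware blob; it is not the script we used, the helper name is hypothetical, and mapping each string back to the functions referencing it still has to be done in the disassembler.

```python
import re

# A NUL-terminated run of identifier characters ending in ".c", not preceded
# by another printable character (so we only match whole strings).
SOURCE_NAME = re.compile(rb"(?<![\x20-\x7e])([A-Za-z0-9_]+\.c)\x00")


def find_source_names(blob: bytes, base: int = 0) -> dict:
    """Map the address of each source-file-like string to its text.

    `base` is the address at which the blob is loaded (an assumption the
    caller has to provide).
    """
    return {
        base + match.start(1): match.group(1).decode("ascii")
        for match in SOURCE_NAME.finditer(blob)
    }
```

Each address returned can then be cross-referenced: a function that passes "bjnp_tcp.c" to a logging primitive very likely belongs to the BJNP TCP code, which gives a usable name prefix.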

On reinventing the wheel

While writing this blogpost, we discovered that someone (leecher1337) had published exactly the same kind of decryption and decompression tool [3], several months before we even looked at the printer. So there are now at least two tools allowing you to decrypt Canon Pixma firmware ¯\_(ツ)_/¯.

DryOS

The operating system on the printer is based on a custom Real Time Operating System named “DryOS”:

DRYOS version 2.3, release #0049+SMP

This system is itself based on µITRON, a Japanese RTOS specification, as can be seen in the following string: "ITRON4.0"

This system is used in all of the printer firmware images we looked at, and is also used in Canon DSLR cameras [4] [5].

RTOS Tasks

All of the services provided by the printer firmware are centered around more than 300 concurrent tasks. In the binary, these tasks are defined using the following structure:

struct dry_task {
    int field_0;
    int field_4;
    void *lpTaskFunction;
    int field_C;
    int field_10;
    int field_14;
    char *lpszTaskName;
    int field_1C;
};

Using this structure, we were able to identify the functions responsible for handling the different tasks in the firmware. For example, the following code extract shows the definition of the main HTTP server handler, and two of its workers.

ROM:005E3E8C                 task <0, 0, TSK_HTTPD_sub_24104+1, 0xA, 0x600, 0, aTskhttpd, 1>; 0x4C
ROM:005E3E8C                 task <0, 0, HTTP_WORKER1_sub_26458+1, 0xA, 0x3200, 0, aTskhttpwork0,1>  ; 0x4D
ROM:005E3E8C                 task <0, 0, HTTP_WORKER_2sub_26466+1, 0xA, 0x3200, 0, aTskhttpwork1, 1> ; 0x4E
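Walking such a task table outside of IDA is straightforward with Python's struct module. The sketch below assumes a little-endian 32-bit layout matching the eight 4-byte fields of dry_task, and treats the low bit of the function pointer as the Thumb bit (as suggested by the "+1" in the listing). The names DryTask, parse_tasks and read_cstring are hypothetical, and the table offset and load address must be supplied by the analyst.

```python
import struct
from typing import List, NamedTuple

TASK_STRUCT = struct.Struct("<8I")  # eight little-endian 32-bit fields, 32 bytes


class DryTask(NamedTuple):
    entry: int      # lpTaskFunction with the Thumb bit cleared
    is_thumb: bool  # True when lpTaskFunction had its low bit set
    name_ptr: int   # lpszTaskName (virtual address of the task name)
    raw: tuple      # all eight fields, for the ones not yet understood


def parse_tasks(blob: bytes, table_offset: int, count: int) -> List[DryTask]:
    """Decode `count` consecutive dry_task entries starting at `table_offset`."""
    tasks = []
    for index in range(count):
        raw = TASK_STRUCT.unpack_from(blob, table_offset + index * TASK_STRUCT.size)
        function_ptr = raw[2]
        tasks.append(DryTask(
            entry=function_ptr & ~1,
            is_thumb=bool(function_ptr & 1),
            name_ptr=raw[6],
            raw=raw,
        ))
    return tasks


def read_cstring(blob: bytes, base: int, pointer: int) -> str:
    """Read the NUL-terminated string that `pointer` refers to."""
    start = pointer - base
    end = blob.index(b"\x00", start)
    return blob[start:end].decode("ascii", "replace")
```

Combining parse_tasks with read_cstring gives a list of (task name, entry point) pairs that can be used to name the handler functions in bulk.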

Attack surface

In order to identify the attack surface, we ran a quick nmap [6] scan on the printer, which indicated at least 8 listening network services.

Opened TCP ports

nmap -A -p- 192.168.1.36
Starting Nmap 7.80 ( https://nmap.org ) at 2020-08-07 14:00 CEST
Nmap scan report for 192.168.1.36
Host is up (0.047s latency).
Not shown: 65532 closed ports
PORT    STATE SERVICE VERSION
80/tcp  open  http    Canon Pixma printer http config (KS_HTTP 1.0)
|_http-title: Site doesn't have a title.
515/tcp open  printer
631/tcp open  ipp     CUPS 1.4
|_http-server-header: CUPS/1.4
|_http-title: 404 Not Found
Service Info: Device: printer

Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 78.21 seconds

Opened UDP ports

sudo nmap -sU -p- 192.168.1.36
Starting Nmap 7.80 ( https://nmap.org ) at 2020-08-07 14:03 CEST
Nmap scan report for 192.168.1.36
Host is up (0.031s latency).
Not shown: 65528 closed ports
PORT     STATE         SERVICE
68/udp   open|filtered dhcpc
500/udp  open|filtered isakmp
3702/udp open|filtered ws-discovery
5353/udp open          zeroconf
8611/udp open          canon-bjnp1
8612/udp open          canon-bjnp2
8613/udp open          canon-bjnp3
MAC Address: 60:12:8B:68:F8:77 (Canon)

Nmap done: 1 IP address (1 host up) scanned in 65.48 seconds

Custom HTTP Server

The firmware implements its own custom HTTP server. This custom server is recognizable thanks to the particular server string: KS_HTTP.

A Shodan [7] lookup shows that around 3,500 such servers are publicly accessible over the Internet.

The web server is handled by one main task and several workers, which serve the available web pages using an array of the following structure:

struct web_page_handler {
    void *field_0;
    char *base_uri;
    char *filename;
    void *handler;
    int field_10;
    int field_14;
};

For example, the CGI script /English/pages_WinUS/cgi_oth.cgi, targeted in our exploit, is defined in the handlers array in the following manner:

ROM:0072A808                 web_page_handler <null_byte, aEnglishPagesWi, aCgiOthCgi, \
ROM:0072A808                                   VULN_CGI_OTH_CGI+1, 0, 0>; 0x78
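Conceptually, serving a request then amounts to a table lookup. The following Python model is only an illustration: it assumes that a request path is matched against base_uri concatenated with filename, which is our reading of the table rather than a documented behaviour, and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class WebPageHandler:
    base_uri: str                    # e.g. "/English/pages_WinUS/"
    filename: str                    # e.g. "cgi_oth.cgi"
    handler: Callable[[dict], str]   # stand-in for the firmware function pointer


def find_handler(table: Sequence[WebPageHandler], path: str) -> Optional[WebPageHandler]:
    """Return the first entry whose base_uri + filename equals the request path."""
    for entry in table:
        if path == entry.base_uri + entry.filename:
            return entry
    return None


def cgi_oth(params: dict) -> str:
    """Placeholder standing in for the cgi_oth.cgi handler."""
    return "handled cgi_oth with %d parameter(s)" % len(params)


HANDLERS = [WebPageHandler("/English/pages_WinUS/", "cgi_oth.cgi", cgi_oth)]
```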

Each handler uses a global shared object to access the request’s data.

BJNP

BJNP is a proprietary protocol, designed by Canon, in order to print documents over the network, and perform LAN service discovery.

Few resources are available on this protocol; a good starting point is the source code of the Debian package cups-backend-bjnp.

As this is a proprietary “binary” protocol (i.e., one handling many “size” fields), it is a target of choice when looking for out-of-bounds read/write or integer overflow vulnerabilities.

Vulnerabilities

CGI stack buffer overflow

CGI scripts are a prime target for exploitation due to the fact that they parse user input.

This firmware is no exception, as the parsing of textual arguments lacks length checks.

The vulnerable function is called from several CGI handlers, such as cgi_oth.cgi, cgi_ips.cgi or cgi_lan.cgi, thus allowing for several stack-based buffer overflows.

The following code extract, taken from the cgi_oth.cgi page handler, illustrates the pattern for this vulnerability.

    char szOutput[128]; // [sp+8h]
    [...]
    lpszInput = lpHTTPObject->vtable->get_param(lpHTTPObject, "OTH_TXT1");
    URL_decode_sub_1E20EFC6(lpszInput, szOutput);

The function URL_decode_sub_1E20EFC6 is responsible for the overflow, as it copies arbitrary characters from the parameter into the provided stack buffer without checking its size. As it also decodes "%"-encoded characters, it allows arbitrary data to be written to the stack buffer.

int __fastcall URL_decode_sub_1E20EFC6(unsigned __int8 *lpszInput, unsigned __int8 *lpszOutput)
{
  int cur_char; // r0
  char *v5; // r4
  int result; // r0
  char v7[24]; // [sp+0h] [bp-18h] BYREF

  while ( 1 )
  {
    result = *lpszInput;                        // Return when the parameter is finished
    if ( !*lpszInput )
      break;
    cur_char = *lpszInput;
    if ( cur_char == '%' )  { 
    [...] // Convert % encoded characters
    }
    else if ( cur_char == '&' ) { // Terminate the parameter parsing if we attain the & separator
      ++lpszInput; *lpszOutput++ = 0;
    } else {
      if ( cur_char == '+' )  {                  // Replace + by spaces
        ++lpszInput; *lpszOutput = 0x20;
      } else {
        *lpszOutput = *lpszInput++;             // Copy the character
      }
      ++lpszOutput;
    }
  }
  *lpszOutput = result;
  return result;
}
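The effect of the missing length check can be reproduced with a small Python model of the decoder. This is a sketch under assumptions: the "%" branch is elided in the listing above, so the model simply decodes "%XX" pairs and copies a malformed "%" literally, and it only reports how many bytes would land past the 128-byte szOutput buffer without modelling the stack layout beyond it. The function names are hypothetical.

```python
BUF_SIZE = 128  # size of szOutput in the cgi_oth.cgi handler
HEX_DIGITS = b"0123456789abcdefABCDEF"


def model_url_decode(param: bytes) -> bytes:
    """Return every byte the decoder would write, including the final NUL."""
    param = param.split(b"\x00", 1)[0]  # the firmware loop stops at the first NUL
    out = bytearray()
    i = 0
    while i < len(param):
        c = param[i]
        pair = param[i + 1:i + 3]
        if c == ord("%") and len(pair) == 2 and all(ch in HEX_DIGITS for ch in pair):
            out.append(int(pair.decode("ascii"), 16))  # assumed "%XX" handling
            i += 3
            continue
        if c == ord("&"):
            out.append(0)      # the listing writes a NUL and keeps going
        elif c == ord("+"):
            out.append(0x20)   # '+' becomes a space
        else:
            out.append(c)      # plain copy
        i += 1
    out.append(0)              # final terminator written after the loop
    return bytes(out)


def overflow_bytes(param: bytes, buffer_size: int = BUF_SIZE) -> int:
    """Number of bytes written past the end of a buffer of `buffer_size` bytes."""
    return max(0, len(model_url_decode(param)) - buffer_size)
```

A bounded decoder would stop, or reject the request, as soon as the output length reaches the buffer size; the model shows that nothing of the sort happens here.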

The following command is a proof of concept that crashes a targeted printer:

curl 'http://target/English/pages_WinUS/cgi_oth.cgi' --data 'OTH_TXT2=++++AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA’'

Exploitation

Exploiting this vulnerability leads to remote code execution with full privileges.

While the firmware lacks protections against exploitation, such as stack cookies, writing a shellcode to exploit the CGI stack-based buffer overflow is a bit more complicated than usual, because the underlying system differs a lot from common ones.

As we know our readers like challenges, we leave this as an exercise to them :)

BJNP out-of-bounds write

The BJNP protocol is handled by the following tasks:

tskBJNP
tskBJNPPrinterTCP
tskBJNPPrinterUDP
tskBJNPScannerTCP
tskBJNPScannerUDP

The task tskBJNPPrinterTCP initializes a context structure for handling BJNP messages received on TCP port 8611, and then uses socket(), bind(), listen() and accept() to receive incoming connections from BJNP clients. Each client is handled in BJNP_tcp_process_message, which reads a 16-byte structure from the socket. This structure is defined in cups-backend-bjnp as follows:

struct  __attribute__((__packed__)) bjnp_header {
   char BJNP_id[4];              /* string: BJNP */
   uint8_t dev_type;             /* 1 = printer, 2 = scanner */
   uint8_t cmd_code;             /* command code/response code */
   uint16_t unknown1;            /* unknown, always 0? */
   uint16_t seq_no;              /* sequence number */
   uint16_t session_id;          /* session id for printing */
   uint32_t payload_len;         /* length of command buffer */
};
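For experimenting with the protocol from a workstation, the header can be built and parsed with Python's struct module. This sketch assumes the multi-byte fields are sent in network (big-endian) byte order; treat that as an assumption to verify against a capture. The helper names are hypothetical.

```python
import struct

# char[4], uint8, uint8, uint16, uint16, uint16, uint32: 16 bytes in total.
BJNP_HEADER = struct.Struct(">4sBBHHHI")  # assumption: big-endian on the wire


def pack_header(dev_type: int, cmd_code: int, seq_no: int,
                session_id: int, payload_len: int) -> bytes:
    """Serialize a bjnp_header with the 'BJNP' magic and unknown1 set to 0."""
    return BJNP_HEADER.pack(b"BJNP", dev_type, cmd_code, 0,
                            seq_no, session_id, payload_len)


def unpack_header(data: bytes) -> dict:
    """Parse the first 16 bytes of a BJNP message into a dictionary."""
    magic, dev_type, cmd_code, unknown1, seq_no, session_id, payload_len = \
        BJNP_HEADER.unpack_from(data)
    if magic != b"BJNP":
        raise ValueError("bad magic: %r" % magic)
    return {
        "dev_type": dev_type,
        "cmd_code": cmd_code,
        "unknown1": unknown1,
        "seq_no": seq_no,
        "session_id": session_id,
        "payload_len": payload_len,
    }
```

Note that payload_len is a full 32-bit value chosen by the sender, so nothing in the header format itself limits it.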

After reading the 16-byte bjnp_header structure, the magic number is checked, and a dispatch function is called (using a function pointer) in order to process the message:


int __fastcall BJNP_tcp_process_message(bjnp_tcp_ctx *ctx)
{
  char *v2; // r1
  void *v3; // r2
  void *v4; // r3
  int v6; // r3

  while ( 1 )
  {
    if ( bjnp_tcp_read_n_bytes(ctx->sockclient, ctx->buff_addr, 16, 0, &ctx->last_read_retval) != 16 )// read a 16 bytes bjnp_header structure
    {
      if ( ctx->last_read_retval < 0 )
        return bjnp_close_context(ctx, v2, (int)v3, (int)v4);
      if ( !ctx->last_read_retval )
        break;
    }
    if ( bjnp_read_magic((unsigned int *)ctx->buff_addr) == 'BJNP' )// check the magic number correctness
    {
      if ( ((int (__fastcall *)(bjnp_tcp_ctx *))ctx->bnjp_callback)(ctx) < 0 ) // call a dispatch function in order to process the message
      {
        if ( ctx->last_read_retval < 0 )
          return bjnp_close_context(ctx, v2, (int)v3, (int)v4);
        if ( !ctx->last_read_retval )
          return sub_1E00106C(ctx, (int)v2, v3, v4);
      }
    }
    else // if the magic number is invalid, reply to the client with a special BJNP message
    {
      qmemcpy(ctx->buff_addr, "BJNP", 4);
      bjnp_build_response_header(ctx->buff_addr, 0x8200, 0);
      if ( NW_send(ctx->sockclient, (int)ctx->buff_addr, 0x10u, 0) < 0 )
      {
        bjnp_error("bjnp_tcp.c", 292, (int)"NW_send error");
        sub_1E1F0DDE(1, "bjnp_tcp.c", 293, v6);
      }
    }
  }
  return sub_1E00106C(ctx, (int)v2, v3, v4);
}

The dispatch function calls several routines according to the command code (cmd_code) specified in the header. An out-of-bounds write vulnerability has been identified within two routines, called when dev_type is set to 0 and cmd_code is set to either 1 or 2. Here is the code of the routine related to cmd_code 1:


int __fastcall bjnp_tcp_handle_msg_0x01(bjnp_tcp_ctx *ctx)
{
  unsigned int payload_len; // r5
  int v3; // r6

  payload_len = bjnp_read_payload_len((int)ctx->buff_addr);
  bjnp_build_response_header(ctx->buff_addr, 0, 0);
  v3 = bjnp_tcp_send(ctx->sockclient, (int)ctx->buff_addr, 16u);
  if ( bjnp_read_response(ctx, payload_len) != payload_len )
    v3 = -1;
  return v3;
}

The bjnp_read_payload_len function returns the payload_len field from the header structure filled in by BJNP_tcp_process_message. As this size is specified by the TCP client that sent the header, it is entirely attacker-controlled. This size is then passed to bjnp_read_response to specify how many bytes must be read from the socket and copied into the destination buffer. This gives an out-of-bounds write primitive, as the destination buffer is only 0x6000 bytes long while the copy size is a 32-bit integer. The destination buffer is located in memory at 0x18998160, just after the BJNP TCP context structure. One exploitation scenario would be to overwrite the callback function pointer of the BJNP UDP context structure, located shortly after the destination buffer, at 0x1899E1A8.
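The distances involved follow directly from the addresses given above. The sketch below assumes that the payload is written from the very start of the destination buffer and that the callback is a 4-byte pointer (consistent with a 32-bit ARM target); both points were not verified on a device, and the names are hypothetical.

```python
BUF_ADDR = 0x18998160           # destination buffer
BUF_SIZE = 0x6000               # its size
UDP_CALLBACK_ADDR = 0x1899E1A8  # callback pointer in the BJNP UDP context
POINTER_SIZE = 4                # assumption: 32-bit pointers

buf_end = BUF_ADDR + BUF_SIZE
gap_after_buffer = UDP_CALLBACK_ADDR - buf_end
callback_offset = UDP_CALLBACK_ADDR - BUF_ADDR
min_payload_len = callback_offset + POINTER_SIZE


def is_in_bounds(payload_len: int) -> bool:
    """The check missing from the handler: reject anything larger than the buffer."""
    return 0 <= payload_len <= BUF_SIZE


print(hex(buf_end), hex(gap_after_buffer), hex(callback_offset), hex(min_payload_len))
```

Under these assumptions, the callback pointer sits 0x48 bytes past the end of the buffer, so a payload_len of at least 0x604C would be needed to cover it, while a correct handler would refuse anything above 0x6000.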

Exploitation

We didn’t manage to trigger this vulnerability on a physical device, as our test model (an MX 475 series printer) doesn’t have TCP BJNP ports open by default (cf. Opened TCP ports).

If you have a Canon MX920 series printer at home, feel free to implement a POC and tell us whether you managed to trigger this vulnerability!

Countermeasures

As you'll see in the timeline section, Canon decided not to patch our vulnerabilities. In order to mitigate the risks posed by vulnerable devices, we recommend setting up authentication with strong passwords and, if possible, segregating the devices from the rest of the network. We also recommend keeping them updated when security patches are available. It's worth mentioning that Canon also released a document named "Useful Tips for Reducing the Risk of Unauthorized Access for Inkjet Printer", available at https://global.canon/en/support/security/pdf/inkjet-printer.pdf.

Conclusion

While it is really fun to play Doom on a printer, building on previous research can unlock new quests [8] for finding vulnerabilities in printer firmware, which is equally if not more fun :)

Timeline

06/07/2020 - Start of the research dedicated to firmware decryption / decompression and attack surface analysis (2 days)
03/08/2020 - Firmware reverse engineering / vulnerability research (5 days)
04/08/2020 - Stack Buffer-overflow vulnerability identified in cgi_oth.cgi
07/08/2020 - Second-hand Canon MX 425 Printer purchased and first vulnerability (CGI stack buffer overflow) confirmed with a POC
11/08/2020 - First mail sent to product-security@canon-europe.com in order to discuss vulnerability reporting process
12/08/2020 - Vulnerability details sent to Canon Europe Product Security
27/08/2020 - New mail sent to Canon as they didn't reply to our vulnerability report
28/08/2020 - Canon Europe Product Security replied that our findings have been forwarded to Canon Inc
17/11/2020 - More than 90 days have passed since the vulnerability details were reported; we asked Canon Europe Product Security for an update
17/11/2020 - Canon Europe Product Security replied that "Following some investigation, this issue appears to relate to CVE-2013-4615. By following the ‘Useful Tips for Reducing the Risk of Unauthorized Access for Inkjet Printer’ https://global.canon/en/support/security/pdf/inkjet-printer.pdf we believe will mitigate the vulnerability."
17/11/2020 - We notified Canon Europe Product Security that we understood our vulnerability would not be patched and should be mitigated using their recommendations. We also announced that we would publish our work and findings.
25/11/2020 - CVE-2020-29073 assigned to the stack buffer overflow vulnerability
