---
title: "Basics of Binary Exploitation"
date: 2019-08-19T11:53:57+05:30
description: "All you need to get started with binary exploitation"
draft: false
---

## Intro into assembly

Each personal computer has a microprocessor that manages the computer’s arithmetical, logical, and control activities. Each family of processors has its own set of instructions for handling operations like getting user input, displaying info on screen, etc. This set of instructions is called ‘machine language instructions’. A processor can only understand these machine language instructions, which are basically 0’s & 1’s. Assembly is the low-level, human-readable form of those instructions.

Assembly can be intimidating, so I will sum it up for you; this is enough to start pwning some binaries.

- In assembly you are given 8-32 global variables of fixed size to work with, which are called “registers”.
- There are some special registers too. The most important is the “program counter”, which tells the CPU which instruction to execute next. This is the same as the IP (instruction pointer), so don’t get confused.
- Technically, all computation is executed on registers. A 64-bit processor has 64-bit registers, which let the CPU work with 64-bit values and 64-bit memory addresses that do not fit into a 32-bit register. Most programs written for 32-bit processors can still run on 64-bit computers, while 64-bit programs are not backward compatible with 32-bit machines.
- But big programs need more space, so they access memory. Memory is accessed either by address or through push & pop operations on a stack.
- Control flow is handled by altering the program counter directly using jumps, branches, or calls. These instructions are called “GOTOs”.
- Status flags are generally 1 bit each. A flag is either set (1) or reset (0).
- Branches are just GOTOs that are predicated on a status flag, like “GOTO this address only if the last arithmetic operation resulted in zero”.
- A CALL is just an unconditional GOTO that pushes the next address on the stack, so a RET instruction can later pop it off and keep going where the CALL left off.

I think this is enough info about assembly, and you’re ready to dive into binary exploitation. If you want to learn more, this book is awesome: [here](https://beginners.re/RE4B-EN.pdf)

## Let’s start pwning binaries

To start, you will need a disassembler (converts 0’s & 1’s [machine code] into assembly) like radare2, IDA, objdump, etc., and a debugger (used to debug programs) like gdb, OllyDbg, etc.

Let’s get started. Here is the code that I wrote, and we will try to exploit it. It’s a simple license checker that compares two strings. The source is available on my [github](https://github.com/anon6405/binary_exploit).

crackme1.c

```c
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]){
    /* Expect exactly one argument: the license key */
    if(argc==2){
        printf("Checking Licence: %s\n", argv[1]);
        /* strcmp returns 0 when both strings are identical */
        if(strcmp(argv[1], "hello_stranger")==0){
            printf("Access Granted!\n");
            printf("Your are 1337 h4xx0r\n");
        }
        else{
            printf("Wrong!\n");
        }
    }
    else{
        /* Wrong number of arguments: print usage and fail */
        fprintf(stderr, "Usage: %s \n", argv[0]);
        return 1;
    }
    return 0;
}
```

This code is pretty simple and I hope you can understand it. So let’s compile it.

```shell
$ gcc crackme1.c -o crackme1
```

Now we will use gdb to debug our program:

```
$ gdb crackme1
```

We know that every program has a main function. So let’s disassemble it.
```
(gdb) disassemble main
```

It will print this:

```shell
Dump of assembler code for function main:
   0x0000000000001169 <+0>:     push   %rbp
   0x000000000000116a <+1>:     mov    %rsp,%rbp
   0x000000000000116d <+4>:     sub    $0x10,%rsp
   0x0000000000001171 <+8>:     mov    %edi,-0x4(%rbp)
   0x0000000000001174 <+11>:    mov    %rsi,-0x10(%rbp)
   0x0000000000001178 <+15>:    cmpl   $0x2,-0x4(%rbp)
   0x000000000000117c <+19>:    jne    0x11e3
   0x000000000000117e <+21>:    mov    -0x10(%rbp),%rax
   0x0000000000001182 <+25>:    add    $0x8,%rax
   0x0000000000001186 <+29>:    mov    (%rax),%rax
   0x0000000000001189 <+32>:    mov    %rax,%rsi
   0x000000000000118c <+35>:    lea    0xe71(%rip),%rdi        # 0x2004
   0x0000000000001193 <+42>:    mov    $0x0,%eax
   0x0000000000001198 <+47>:    callq  0x1040
   0x000000000000119d <+52>:    mov    -0x10(%rbp),%rax
   0x00000000000011a1 <+56>:    add    $0x8,%rax
   0x00000000000011a5 <+60>:    mov    (%rax),%rax
   0x00000000000011a8 <+63>:    lea    0xe6b(%rip),%rsi        # 0x201a
   0x00000000000011af <+70>:    mov    %rax,%rdi
   0x00000000000011b2 <+73>:    callq  0x1050
   0x00000000000011b7 <+78>:    test   %eax,%eax
   0x00000000000011b9 <+80>:    jne    0x11d5
   0x00000000000011bb <+82>:    lea    0xe67(%rip),%rdi        # 0x2029
   0x00000000000011c2 <+89>:    callq  0x1030
   0x00000000000011c7 <+94>:    lea    0xe6b(%rip),%rdi        # 0x2039
   0x00000000000011ce <+101>:   callq  0x1030
   0x00000000000011d3 <+106>:   jmp    0x120c
   0x00000000000011d5 <+108>:   lea    0xe72(%rip),%rdi        # 0x204e
   0x00000000000011dc <+115>:   callq  0x1030
   0x00000000000011e1 <+120>:   jmp    0x120c
   0x00000000000011e3 <+122>:   mov    -0x10(%rbp),%rax
   0x00000000000011e7 <+126>:   mov    (%rax),%rdx
   0x00000000000011ea <+129>:   mov    0x2e6f(%rip),%rax        # 0x4060
   0x00000000000011f1 <+136>:   lea    0xe5d(%rip),%rsi        # 0x2055
   0x00000000000011f8 <+143>:   mov    %rax,%rdi
   0x00000000000011fb <+146>:   mov    $0x0,%eax
   0x0000000000001200 <+151>:   callq  0x1060
   0x0000000000001205 <+156>:   mov    $0x1,%eax
   0x000000000000120a <+161>:   jmp    0x1211
   0x000000000000120c <+163>:   mov    $0x0,%eax
   0x0000000000001211 <+168>:   leaveq
   0x0000000000001212 <+169>:   retq
End of assembler dump.
```

This looks ugly, right?
Well, it’s AT&T syntax. Change it to Intel syntax using:

```
(gdb) set disassembly-flavor intel
```

To make the change permanent, create ~/.gdbinit and add:

```
set disassembly-flavor intel
```

Disassemble main again and you will get more readable code:

```shell
Dump of assembler code for function main:
   0x0000000000001169 <+0>:     push   rbp
   0x000000000000116a <+1>:     mov    rbp,rsp
   0x000000000000116d <+4>:     sub    rsp,0x10
   0x0000000000001171 <+8>:     mov    DWORD PTR [rbp-0x4],edi
   0x0000000000001174 <+11>:    mov    QWORD PTR [rbp-0x10],rsi
   0x0000000000001178 <+15>:    cmp    DWORD PTR [rbp-0x4],0x2
   0x000000000000117c <+19>:    jne    0x11e3
   0x000000000000117e <+21>:    mov    rax,QWORD PTR [rbp-0x10]
   0x0000000000001182 <+25>:    add    rax,0x8
   0x0000000000001186 <+29>:    mov    rax,QWORD PTR [rax]
   0x0000000000001189 <+32>:    mov    rsi,rax
   0x000000000000118c <+35>:    lea    rdi,[rip+0xe71]        # 0x2004
   0x0000000000001193 <+42>:    mov    eax,0x0
   0x0000000000001198 <+47>:    call   0x1040
   0x000000000000119d <+52>:    mov    rax,QWORD PTR [rbp-0x10]
   0x00000000000011a1 <+56>:    add    rax,0x8
   0x00000000000011a5 <+60>:    mov    rax,QWORD PTR [rax]
   0x00000000000011a8 <+63>:    lea    rsi,[rip+0xe6b]        # 0x201a
   0x00000000000011af <+70>:    mov    rdi,rax
   0x00000000000011b2 <+73>:    call   0x1050
   0x00000000000011b7 <+78>:    test   eax,eax
   0x00000000000011b9 <+80>:    jne    0x11d5
   0x00000000000011bb <+82>:    lea    rdi,[rip+0xe67]        # 0x2029
   0x00000000000011c2 <+89>:    call   0x1030
   0x00000000000011c7 <+94>:    lea    rdi,[rip+0xe6b]        # 0x2039
   0x00000000000011ce <+101>:   call   0x1030
   0x00000000000011d3 <+106>:   jmp    0x120c
   0x00000000000011d5 <+108>:   lea    rdi,[rip+0xe72]        # 0x204e
   0x00000000000011dc <+115>:   call   0x1030
   0x00000000000011e1 <+120>:   jmp    0x120c
   0x00000000000011e3 <+122>:   mov    rax,QWORD PTR [rbp-0x10]
   0x00000000000011e7 <+126>:   mov    rdx,QWORD PTR [rax]
   0x00000000000011ea <+129>:   mov    rax,QWORD PTR [rip+0x2e6f]        # 0x4060
   0x00000000000011f1 <+136>:   lea    rsi,[rip+0xe5d]        # 0x2055
   0x00000000000011f8 <+143>:   mov    rdi,rax
   0x00000000000011fb <+146>:   mov    eax,0x0
   0x0000000000001200 <+151>:   call   0x1060
   0x0000000000001205 <+156>:   mov    eax,0x1
   0x000000000000120a <+161>:   jmp    0x1211
   0x000000000000120c <+163>:   mov    eax,0x0
   0x0000000000001211 <+168>:   leave
   0x0000000000001212 <+169>:   ret
End of assembler dump.
```

Now make an assumption about how this binary works. When you run it without any argument, it displays the usage message. If you pass two arguments, where the first one is the program name itself and the second one is the license key, it displays an access granted or access denied message.

Now apply that assumption to the assembly code. For exploitation, we can ignore most of the stuff.

At 0x1178, you can see a cmp instruction which compares the value stored at [rbp-0x4] with hex 0x2 (which is 2 in decimal). According to our assumption, that must be checking the number of arguments. Just below that, 0x117c has a jne (jump if not equal). So if the argument count is not 2, control flow jumps to addr 0x11e3, which is the usage-message path.

At addr 0x1198, it calls printf, which is probably printing “Checking Licence:” when you run the binary.

The next interesting addr is 0x11b2, which calls strcmp (string compare). It should be comparing our key with the correct key to verify it. strcmp leaves its return value in eax, and that value is 0 if the strings match.

Next we have 0x11b7, which is a test instruction. test eax,eax sets the zero flag when eax is 0, that is, when the strings matched. After that we have addr 0x11b9, which is a jne: it jumps to addr 0x11d5 if the strings are not equal.

After that we have 0x11c2 and 0x11ce, which call puts (it just prints stuff). These print “Access Granted!” and some other text if we give the correct key. Next is 0x11d3, which jumps to 0x120c, where main sets its return value to 0 and the program ends.

Now let’s exploit it using gdb to print the access granted message without using the key. First, set a breakpoint at main. A breakpoint is an address at which execution stops.

```
(gdb) break *main
```

Now run the program and watch the control flow. You can use pen and paper for better understanding.

```
(gdb) run
(gdb) ni
```

ni executes the next instruction. After that, just press enter and gdb will repeat it and execute the following instruction.

Now try running the program with a key.
```
(gdb) run random_key
(gdb) ni
```

Carefully watch the control flow this time.

According to our assumption, if we set eax to 0 at addr 0x11b7, we are telling the program that the strings matched, and it will print the access granted message. So set breakpoint 2 at the address of test eax, eax. Since the program has already been started, disass main now shows the runtime addresses (0x00005555555551b7 here instead of 0x11b7).

```
(gdb) disass main
(gdb) break *0x00005555555551b7
```

Run the program again with a random_key.

```
(gdb) run random_key
```

After hitting the first breakpoint, type continue to jump to the next breakpoint.

```
(gdb) continue
(gdb) info registers
(gdb) set $eax=0
(gdb) ni
```

Here I set the value of eax to 0 and run the program instruction by instruction. After setting eax=0, the jne at addr 0x00005555555551b9 will not take its jump, because test eax,eax now sets the zero flag. Use ni to continue executing the next instructions.

```shell
(gdb) run random_key
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: crackme1 random_key

Breakpoint 1, 0x0000555555555169 in main ()
(gdb) continue
Continuing.
Checking Licence: random_key

Breakpoint 2, 0x00005555555551b7 in main ()
(gdb) info registers
rax            0x3                 3
rbx            0x0                 0
rcx            0xfff7fdff          4294442495
rdx            0x68                104
rsi            0x55555555601a      93824992239642
rdi            0x7fffffffe563      140737488348515
rbp            0x7fffffffe170      0x7fffffffe170
rsp            0x7fffffffe160      0x7fffffffe160
r8             0xffffffff          4294967295
r9             0x1d                29
r10            0xfffffffffffff1a9  -3671
r11            0x7ffff7f36140      140737353310528
r12            0x555555555070      93824992235632
r13            0x7fffffffe250      140737488347728
r14            0x0                 0
r15            0x0                 0
rip            0x5555555551b7      0x5555555551b7
eflags         0x206               [ PF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) set $eax=0
(gdb) info registers
rax            0x0                 0
rbx            0x0                 0
rcx            0xfff7fdff          4294442495
rdx            0x68                104
rsi            0x55555555601a      93824992239642
rdi            0x7fffffffe563      140737488348515
rbp            0x7fffffffe170      0x7fffffffe170
rsp            0x7fffffffe160      0x7fffffffe160
r8             0xffffffff          4294967295
r9             0x1d                29
r10            0xfffffffffffff1a9  -3671
r11            0x7ffff7f36140      140737353310528
r12            0x555555555070      93824992235632
r13            0x7fffffffe250      140737488347728
r14            0x0                 0
r15            0x0                 0
rip            0x5555555551b7      0x5555555551b7
eflags         0x206               [ PF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) ni
0x00005555555551b9 in main ()
(gdb)
0x00005555555551bb in main ()
(gdb)
0x00005555555551c2 in main ()
(gdb)
Access Granted!
0x00005555555551c7 in main ()
(gdb)
0x00005555555551ce in main ()
(gdb)
Your are 1337 h4xx0r
0x00005555555551d3 in main ()
(gdb)
```

Voila! You have cracked the program without knowing the correct key.

This one is just a basic intro to binary exploitation, and it is enough to get you started.