HYP3RV3LOCITY: Introduction to Stack Buffer Overflows: Overwriting Data

Introduction to Buffer Overflows

In the previous post, we talked about the stack layout on x86 processors and the x86 general registers and instruction set. In this post, we will learn how stack buffer overflows occur and how we can exploit this vulnerability to overwrite data in memory.

A buffer is a generic term for a block of data storage in memory. A buffer overflow occurs when we put more data into a buffer than it can hold. The extra data spills into the adjacent region of memory, which will usually cause the program to crash. Sometimes, however, it is possible to overflow into a specific region of memory with a specific value, so that when the program later uses that memory, the data it finds there is still valid.

An analogy would be a fill-in-the-blank sentence where the blank is our buffer. Here the blank has room for five characters:
Tell _____ to go to Room 212 in Building A.
An overflow that will crash our program would be something like this, where the name is too long and its extra characters overwrite part of the sentence:
Tell hypervelocityo Room 212 in Building A.
An overflow that will not crash would go something like this, where the extra characters still leave a sentence that makes sense:
Tell hypervelocity to not go in Building A.
In exploiting a buffer overflow vulnerability, we essentially overflow the buffer in a way that results in the processor still being able to read and write from valid memory. In the following stack overflow example, we will use the buffer overflow vulnerability to write valid data to other variables.

Setting up the Test Environment

Before we begin to examine stack buffer overflows, we'll need to disable some security features in our environment. Because the buffer overflow is such a common vulnerability, systems today have security features that make it difficult to exploit. There are techniques to get around these protections, but for now, we only want to focus on learning the basic exploit method. For this tutorial, you will need to disable address space layout randomization (ASLR) and function stack protection. On Linux, the first command below disables ASLR on most distributions (writing to this file requires root privileges), and the second compiles the example without stack protection. If the first command does not work, look up how ASLR is disabled for your particular distribution.

hypervelocity ~ $ echo 0 > /proc/sys/kernel/randomize_va_space 
hypervelocity ~ $ gcc -m32 -g -O0 -fno-stack-protector -o basicoverflow basicoverflow.c

If you want to enable it again, just echo 1 (or 2) instead.

In OSX, ASLR is disabled by linking with the no-PIE linker option.

hypervelocity ~ $ gcc -m32 -g -O0 -Wl,-no_pie -fno-stack-protector -o basicoverflow basicoverflow.c

In both Linux and OSX, when we compile with GCC (or Clang), we use the -fno-stack-protector option, which disables a security feature that adds extra code to the executable in order to check some functions for buffer overflows. However, not all distributions enable this feature by default, since the extra code may impact performance.

Also, if you are on an x64 system (aka x86_64, amd64) rather than 32-bit x86, you will want to compile with the -m32 option. We are using the x86 instruction set, which is similar to x64 but has a few differences. Most notably, x86 has 32 bit addresses (recall that the general registers we talked about in the last post are all 32 bits, or 4 bytes) and x64 has 64 bit addresses.

We can also compile with the -g option, which gives us debugging information, and -O0, which, as mentioned in the previous post, disables most optimization. You can alternatively compile with -ggdb, which produces debugging information intended for gdb, and -Og, which enables only those optimizations that do not interfere with debugging.

Stack Buffer Overflows

Now that we have our environment set up, we are ready to start learning about stack buffer overflows. A stack buffer overflow (sometimes loosely called a stack overflow) is a type of buffer overflow that occurs specifically on the stack. Below is a program that first declares two integers and a char array, then copies data from stdin into the char array, and then prints the values and locations of the variables. Note that fgets is told to accept up to 50 bytes even though the array only holds 16; this is the vulnerability. The lines after that print the relevant section of the stack one byte at a time.

#include <stdio.h>

int main(){
  int before = 0xdeadbeef;
  char buffer[16];
  int after = 0xdeadbeef;
  fgets(buffer, 50, stdin); //deliberately vulnerable: buffer only holds 16 bytes
  printf("[before] - address (%p) value: %08x\n", (void *)&before, before);
  printf("[buffer] - address (%p) value: %s", (void *)buffer, buffer);
  printf("[after] - address (%p) value: %08x\n", (void *)&after, after);

  //dump the stack byte by byte, from 'after' to 8 bytes past the start of 'before'
  char * ptr = (char *) (int *)&after; //1 byte pointer to 'after'
  char * stop = (char *) (int *)&before; //1 byte pointer to 'before'
  stop +=8; //offset 8 bytes after the start of 'before'
  for ( ; ptr != stop; ptr++){
   printf("%p: %x\n", ptr, (unsigned char)*ptr);
  }
  return 0;
}

Below are the results of inputting AAAABBBB1234 into stdin. The labels on the right of the stack address print out are comments I've added.

hypervelocity ~ $ gcc -m32 -fno-stack-protector -o basicoverflow basicoverflow.c 
hypervelocity ~ $ ./basicoverflow
AAAABBBB1234
[before] - address (0xbffffb78) value: deadbeef
[buffer] - address (0xbffffb68) value: AAAABBBB1234
[after] - address (0xbffffb64) value: deadbeef
0xbffffb64: ef  <--[after] start
0xbffffb65: be
0xbffffb66: ad
0xbffffb67: de  <--[after] end
0xbffffb68: 41  <--[buffer] start
0xbffffb69: 41
0xbffffb6a: 41
0xbffffb6b: 41      A
0xbffffb6c: 42      B
0xbffffb6d: 42
0xbffffb6e: 42
0xbffffb6f: 42      B
0xbffffb70: 31      1
0xbffffb71: 32      2
0xbffffb72: 33      3
0xbffffb73: 34  <-- 4     {end of user input}
0xbffffb74: a
0xbffffb75: 0   <--[buffer] end
0xbffffb76: e5
0xbffffb77: 8f
0xbffffb78: ef  <-- [before] start
0xbffffb79: be
0xbffffb7a: ad
0xbffffb7b: de  <-- [before] end
0xbffffb7c: 0
0xbffffb7d: 0
0xbffffb7e: 0
0xbffffb7f: 0

Hexadecimal Numbers, Little Endian

In the program, we printed the addresses of the variables along with their values. We also printed a series of addresses along with the value stored at each one in hexadecimal. Each address holds one byte. A two-digit hexadecimal number represents one byte (8 bits) because its maximum value, FF, is 255, which equals the maximum value of the 8-digit binary number 11111111, where each digit is a single bit. There are no half bytes here: the values printed with a single digit have a leading 0 that printf left out. For example, the value at 0xbffffb74 is 0a.

In the first four lines of the printed addresses, we have the variable [after]. Notice that the address of [after] is 0xbffffb64 and that this address holds the least significant byte of its value: 'ef'. On Intel architectures, bytes are arranged in little endian order, meaning that the least significant byte is at the lowest address. So for the value 0xdeadbeef, instead of having \xDE\xAD\xBE\xEF in memory, we will have \xEF\xBE\xAD\xDE. This tells us that when a variable takes up more than one byte, the address of the variable refers to its least significant byte, which is located at its lowest address, and the rest of the data follows at the adjacent higher addresses.

One way to remember little endian is to picture the stack growing towards lower addresses. If the computer were moving the bytes of [before] onto the stack one by one, it would move \xDE first. Then it would proceed to \xAD, \xBE, and finally \xEF, with each successive byte landing at a lower address. This is only a memory aid: byte order is a property of the processor and applies to data anywhere in memory, not just on the stack.

The char array differs from the integer variables in that its user-inputted value is not reversed. Each character is a single byte, so there is no byte order to apply, and the elements of an array are always stored at increasing addresses. The user input begins right after the end of [after], and each later character sits at a higher address, towards [before]. (Our user-inputted characters, 'AAAABBBB1234', are represented with their ASCII values, which is why we see 31, 32... instead of the numerical values 1, 2.)

Notice the lowercase 'a' and the '0' where I've labeled [buffer] end. The 'a' is 0x0a, the ASCII code for the newline character ('\n'); it is there because fgets stores the newline we typed when we pressed Enter. The '0' is the null byte ('\0') that fgets appends after the input. It signals the end of the string to functions such as printf. Note that the label marks the end of the string stored in the buffer, not the end of the 16 byte array, which runs through 0xbffffb77. So even though 16 bytes are allocated for the char array, only 14 of them are available for typed characters if the newline and null byte are to fit as well.

The variable [before] starts at 0xbffffb78, immediately after the last byte of the buffer. The two values printed between the null byte and the start of [before] (e5 and 8f) are the unused last two bytes of [buffer]; they are just garbage values that were left on the stack by some earlier operation.

From looking at the memory, we can tell that the variables are right next to each other. Even though we did not use the entirety of [buffer], the locations of the variables will not change regardless of what we input. Just to check that they are indeed right next to each other, we can do some hex math using the handy POSIX bc calculator.

hypervelocity ~ $ echo "ibase=16; B78-B64" | bc
20
hypervelocity ~ $ echo "ibase=16; B78-B68" | bc
16
hypervelocity ~ $ echo "ibase=16; B68-B64" | bc
4

Alternatively, use Python (the commands below use Python 2 syntax):

hypervelocity ~ $ python -c 'print 0x78-0x68'
16
hypervelocity ~ $ python -c 'print 0x68-0x64'
4

Pointer Type Casting and Arithmetic

Pointers are important to understand and pretty interesting. This section is intended to clarify these lines from the program we're using in this post.

//...
char * ptr = (char *) (int *)&after; //1 byte pointer to 'after'
char * stop = (char *) (int *)&before; //1 byte pointer to 'before'
stop +=8; //offset 8 bytes after the start of 'before'
for ( ; ptr != stop; ptr++){
 printf("%p: %x\n", ptr, (unsigned char)*ptr);
}
//...

What's happening here is that we have the integer variables [after] and [before], and we want to print memory one byte at a time, starting at [after] and ending a few bytes past [before]. If we used int * pointers, each increment in the loop would advance the address by four bytes, because pointer arithmetic moves in units of the pointed-to type and an int is 4 bytes here. With char pointers, each increment advances by exactly 1 byte. (GCC also lets void pointers advance by 1 byte, but that is a compiler extension rather than standard C.)

The important thing to understand about the two assignments at the top of this segment is that the addresses [ptr] and [stop] receive from [after] and [before] are the same whether they are stored in char pointers or int pointers. (The intermediate (int *) cast is redundant, since &after is already an int *; only the (char *) cast matters.) What changes is how many bytes are read when the pointer is dereferenced. Dereferencing a char pointer to an integer yields the single byte at the integer's lowest address, which on little-endian x86 is its least significant byte. For the integer [after], that is 0xef, which we can confirm because 0xef is the first byte our program prints when it dumps the stack. The pointer's type simply tells the compiler how many bytes to read, starting at that address and moving towards higher addresses.

Overflowing into other Variables

Now that we have an idea of how data is arranged in memory, we can start overflowing our char array. A convenient way to do this on the command line, so that we don't have to type out letters, is to use python or perl to output the characters we want. The commands below print a string of 16 "A" characters using python 2 and python 3 respectively.

hypervelocity ~ $ python -c 'print "A"*16'
AAAAAAAAAAAAAAAA
hypervelocity ~ $ python -c 'print("A"*16)'
AAAAAAAAAAAAAAAA

We will pipe the result of our command into the program:

hypervelocity ~ $ (python -c 'print "A"*16') | ./basicoverflow
[before] - address (0xbffffb78) value: dead000a
[buffer] - address (0xbffffb68) value: AAAAAAAAAAAAAAAA
[after] - address (0xbffffb64) value: deadbeef
0xbffffb64: ef
0xbffffb65: be
0xbffffb66: ad
0xbffffb67: de
0xbffffb68: 41
0xbffffb69: 41
0xbffffb6a: 41
0xbffffb6b: 41
0xbffffb6c: 41
0xbffffb6d: 41
0xbffffb6e: 41
0xbffffb6f: 41
0xbffffb70: 41
0xbffffb71: 41
0xbffffb72: 41
0xbffffb73: 41
0xbffffb74: 41
0xbffffb75: 41
0xbffffb76: 41
0xbffffb77: 41
0xbffffb78: a
0xbffffb79: 0
0xbffffb7a: ad
0xbffffb7b: de
0xbffffb7c: 0
0xbffffb7d: 0
0xbffffb7e: 0
0xbffffb7f: 0

Notice here that [before] now reads dead000a instead of deadbeef. Our 16 A's filled the buffer completely, so the newline and null byte overflowed into the two lowest bytes of [before]. Even though [before] changed, [after] did not, which tells us that the overflow only reaches variables stored at higher addresses than the buffer. In this build those are the variables declared before the buffer, since they were allocated on the stack first and therefore received higher addresses, while the input fills the buffer from its lowest address upward. (A compiler is free to order local variables differently, so always check the actual addresses.)

Since 16 bytes of input have caused [before] to lose 2 bytes to the newline and null byte, we can conclude that two more bytes of input would overwrite [before] completely (the newline and null byte would land in its upper two bytes), and four more bytes would fill [before] with the last four bytes of our input. We also found by subtracting the addresses that [before] starts 16 bytes after the start of [buffer]. To make things clearer, we can append four B's to our 16 A's so that we can see [before] get overwritten with BBBB.

hypervelocity ~ $ (python -c 'print "A"*16 + "B"*4') | ./basicoverflow
[before] - address (0xbffffb78) value: 42424242
[buffer] - address (0xbffffb68) value: AAAAAAAAAAAAAAAABBBB
[after] - address (0xbffffb64) value: deadbeef
0xbffffb64: ef
0xbffffb65: be
0xbffffb66: ad
0xbffffb67: de
0xbffffb68: 41
0xbffffb69: 41
0xbffffb6a: 41
0xbffffb6b: 41
0xbffffb6c: 41
0xbffffb6d: 41
0xbffffb6e: 41
0xbffffb6f: 41
0xbffffb70: 41
0xbffffb71: 41
0xbffffb72: 41
0xbffffb73: 41
0xbffffb74: 41
0xbffffb75: 41
0xbffffb76: 41
0xbffffb77: 41
0xbffffb78: 42 <--
0xbffffb79: 42   
0xbffffb7a: 42   
0xbffffb7b: 42 <--
0xbffffb7c: a
0xbffffb7d: 0
0xbffffb7e: 0
0xbffffb7f: 0

Notice that [before] now contains 42424242, which is the ASCII encoding of BBBB. This tells us that we can change the values of variables that sit above our vulnerable buffer in memory. Extending the idea, we can write to any address higher than the address of our buffer, as far as the copy allows. In this program, and with this exploit method, fgets stops after storing 50 bytes (at most 49 characters of input plus the null byte), so that is as far as we can reach.

Conclusion

In this post, we learned more about how data is arranged in memory, and we introduced the stack buffer overflow and how we can write to certain variables of our choice via buffer overflow. In the next post, we will see how stack buffer overflow vulnerabilities can allow us to overwrite the return address of a function with an address of our choice and how we can spawn a shell from that. If you have any questions, suggestions, or corrections, feel free to leave a comment below. Thanks for reading!


Sunday, July 24, 2016
