#30 | bin05 – Diving deep into the heap [1/2]

Hey. Today I’ll keep boring you with computer security and talk about a subject I’ve wanted to learn for a while (over 2 years now) but never managed to find either the time or the motivation to get started on.

I’ll talk about the heap and how it works. The objective of the next posts will be to exploit a use-after-free to get remote code execution on PHP. Even though I think my best skills are web-based, a lot of vulnerabilities linked to unserialize and use-after-free were released for PHP, and I was never able to exploit them because of my lack of knowledge about binary exploitation and, more specifically, heap exploitation. Here’s what I think I’ll need to talk about:

  • How memory allocation works (glibc malloc – ptmalloc2, dlmalloc?, …)
  • Off-by-one in the heap
  • Use-after-free as a general concept and exploitation
  • Learning about PHP internal variable structures
  • Use-after-free and exploitation in the context of PHP using ROP then returning to internal PHP functions (see the work of Stefan Esser)

 

What is the heap?

The heap is the memory section where dynamic allocation takes place. Contrary to the stack, where you need to specify the size of the data you will be using up front, the heap is a more versatile piece of memory whose allocations can be sized at runtime. The heap grows upward while the stack grows downward. The upper part of the heap is called the wilderness or top chunk, and it holds the remaining memory that can still be allocated from the heap.

There are other sections I’ll speak about later (such as the bins/fastbins, used when freeing memory).

There are a lot of memory allocators available, such as dlmalloc (a general purpose allocator), ptmalloc2 (used by glibc), jemalloc (FreeBSD/Firefox), tcmalloc (Google), libumem (Solaris), … Today I will talk about glibc malloc, also known as ptmalloc2.

 

System calls

malloc itself is a library function, not a system call. When it needs more memory from the kernel, it uses one of these two system calls: brk or mmap.

  • brk

This syscall obtains memory from the kernel by moving the program break (the end of the data segment) upward. Pages the kernel hands out for the first time are zero-filled on Linux, but malloc makes no such promise, since memory recycled inside the heap can still hold old data. Initially, the start and the end of the heap segment (start_brk and brk) point to the same location. If ASLR is turned off, they point to the end of the data/bss segment (end_data); otherwise they point to end_data plus a random brk offset.

(Image taken from sploitfun.wordpress.com)

  • mmap

The mmap syscall is used to create a “private anonymous mapping segment”, whose purpose is to allocate new zero-filled memory that is used only by the calling process. From personal experience, I also know that mmap can be used to remap a memory section, to make it writable or executable for example.
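To make the two system calls a bit more concrete, here is a minimal sketch, assuming Linux with glibc. The helper names grow_break and anon_mapping_is_zeroed are mine, made up for illustration. The first one moves the program break with sbrk (the libc wrapper around brk) and measures how far it went; the second one creates a private anonymous mapping and checks that it really comes back zero-filled.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() and MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: move the program break up by `bytes`, measure how
 * far it actually moved, then put it back. Returns the distance in bytes,
 * or -1 if sbrk() failed. */
long grow_break(long bytes)
{
    char *before = sbrk(0);                 /* sbrk(0) only reads the break */
    if (sbrk((intptr_t)bytes) == (void *)-1)
        return -1;
    char *after = sbrk(0);
    long moved = (long)(after - before);
    sbrk(-(intptr_t)bytes);                 /* restore the original break */
    return moved;
}

/* Hypothetical helper: create a private anonymous mapping of `len` bytes
 * and check every byte. Returns 1 if all bytes are zero, 0 if not, and
 * -1 if mmap() failed. */
int anon_mapping_is_zeroed(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            zeroed = 0;
            break;
        }
    }
    munmap(p, len);
    return zeroed;
}
```

Calling grow_break(4096) from a small main and printing the result is an easy way to watch the break move. Poking the break by hand in a program that also uses malloc is only reasonable in a throwaway test like this one.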

 

dlmalloc vs ptmalloc2

During the early days of Linux, dlmalloc was used rather than ptmalloc2, but ptmalloc2 gained popularity thanks to its threading support. With dlmalloc, when two threads call malloc at the same time, one of them has to wait because the freelist data structures are shared. In ptmalloc2, both threads can get memory immediately because each thread maintains a separate heap segment and separate freelist data structures, or bins (this separation is called “per thread arena”).

The first allocation makes the heap grow by at least 132KB, even for a tiny request, and this region of heap memory is called the “main arena” when it is created by the main thread. Later allocations are served from this region until it runs out of free space. When that happens, the arena can grow again by increasing the break location. It can also shrink when there is a lot of free space in the top chunk.
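One way to look at that initial heap size is to read the [heap] line of /proc/self/maps. This is a rough sketch, assuming Linux with /proc mounted; heap_segment_size is a name I made up. Note that fopen allocates memory itself, so the heap already exists by the time the file is read.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the size in bytes of the "[heap]" mapping
 * listed in /proc/self/maps, or 0 if the file or the line is missing. */
size_t heap_segment_size(void)
{
    /* fopen() mallocs its FILE structure, which is enough to make the
     * allocator set up the main heap if it did not exist yet. */
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return 0;

    char line[512];
    size_t size = 0;
    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        /* Each line starts with "start-end" in hexadecimal. */
        if (strstr(line, "[heap]") != NULL &&
            sscanf(line, "%lx-%lx", &start, &end) == 2) {
            size = (size_t)(end - start);
            break;
        }
    }
    fclose(maps);
    return size;
}
```

On a glibc system I would expect the value to be in the 132KB range for a program that has barely allocated anything, but the exact number depends on the libc version and on what already ran before the call.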

When free is called on an allocated memory region, the memory is not released to the OS immediately; it is put into one of the main arena’s bins to be used later. When a user requests memory later on, a chunk sitting in a bin is reused if possible. Memory is obtained from the kernel only when no suitable free block exists.
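The reuse of freed memory is easy to observe. Here is a small sketch, assuming glibc (other allocators may behave differently); freed_chunk_is_reused is a hypothetical helper. It frees a chunk, asks for the same size again and checks whether the same address comes back.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes, free them, allocate the same
 * size again and compare the two addresses. Returns 1 if the freed chunk
 * was handed back, 0 if not, -1 if an allocation failed. */
int freed_chunk_is_reused(size_t size)
{
    void *first = malloc(size);
    if (first == NULL)
        return -1;

    /* Keep the address as an integer: the pointer itself must not be used
     * after free(). volatile stops the optimizer from assuming that two
     * malloc results can never compare equal. */
    volatile uintptr_t first_addr = (uintptr_t)first;
    free(first);

    void *second = malloc(size);
    if (second == NULL)
        return -1;
    volatile uintptr_t second_addr = (uintptr_t)second;
    free(second);

    return first_addr == second_addr;
}
```

This is the behaviour that makes a use-after-free interesting: whatever still points at the first chunk now points at the data of the second allocation.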

When malloc is first called inside a thread, a heap of 1MB is mapped (this is the 32-bit figure; 64-bit glibc reserves 64MB) but only 132KB of it is made read-writable; this region is called the “thread arena”. It is obtained with mmap rather than sbrk.

When a user requests more than 128KB (the default mmap threshold) and there is not enough room in an arena to satisfy the request, the memory is allocated with mmap (and not sbrk).
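A simple way to see this is to check whether the program break moves while a large request is served. The following is a sketch under the assumption of glibc with its default settings; break_delta_for_malloc is a made-up name.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() on glibc */
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return how many bytes the program break moved while
 * one malloc(size) was served, or -1 if the allocation failed. */
long break_delta_for_malloc(size_t size)
{
    /* A first small allocation makes sure the main heap is already set up,
     * so any break movement below is caused by the request being measured. */
    volatile char *warmup = malloc(16);
    if (warmup != NULL)
        warmup[0] = 0;

    char *before = sbrk(0);
    volatile char *p = malloc(size);
    char *after = sbrk(0);

    long delta = -1;
    if (p != NULL) {
        p[0] = 1;          /* touch the memory so the call is not elided */
        delta = (long)(after - before);
    }
    free((void *)p);
    free((void *)warmup);
    return delta;
}
```

For a 1MB request in a fresh process I would expect 0, meaning the memory came from mmap and the break did not move. glibc adjusts the threshold dynamically after an mmapped chunk is freed, so a second call in the same process may give a different answer.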

 

Arena

The maximum number of arenas depends on the number of cores (32-bit: cores * 2, 64-bit: cores * 8). When a program has more threads than arenas, the arenas are shared among the threads. For example, if a third thread is created on a 1-core system, this thread will try to reuse either the main arena, arena 1 or arena 2. If malloc is called again in thread 3, it will try to reuse the arena it accessed last. If that arena is free, it is used; otherwise thread 3 is blocked until the arena becomes available.
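The arena limit is simple arithmetic, so here it is written out as a sketch. The function names are mine; the formula is the one given above, and sysconf(_SC_NPROCESSORS_ONLN) is just one way of asking how many cores are online.

```c
#include <stddef.h>
#include <unistd.h>

/* Arena limit as described above: cores * 2 on 32-bit, cores * 8 on 64-bit.
 * `word_size` is sizeof(long) in bytes (4 on 32-bit, 8 on 64-bit Linux). */
long arena_limit(long cores, size_t word_size)
{
    return cores * (word_size == 4 ? 2 : 8);
}

/* Same computation for the machine the program is running on. */
long arena_limit_here(void)
{
    long cores = sysconf(_SC_NPROCESSORS_ONLN);
    if (cores < 1)
        cores = 1;      /* fall back to one core if the query fails */
    return arena_limit(cores, sizeof(long));
}
```

For the 1-core, 32-bit example above, arena_limit(1, 4) gives 2, and a 4-core 64-bit machine gets arena_limit(4, 8), which is 32.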

 

What did I learn today?

  • System calls used to malloc (brk or mmap)
  • How does brk work (moving the break)
  • How does mmap work (creating a private anonymous mapping segment)
  • Difference dlmalloc & ptmalloc2
  • Threads and their arenas and how they work
  • First allocation (132KB)

 
