Hey, today I’ll go a little bit further into my introduction to heap and hopefully finish the allocation subject I started. Last time I was able to understand how an allocation work roughly (syscall usage, arena creation and distribution depending on threads), let’s continue.
By the way, since it is the 32nd article, I think we can officially say we did get past the first-month threshold!
Multiple data structures can be found inside glibc, here’s a quick list:
- heap_info (or the Heap Header) – Basically, one thread arena can have multiple heaps if the first heap segment runs out of space. The new heap segment (non-contiguous) will get mmap’d to the arena. This structure won’t be present in the main arena due to the fact that the main arena only has one heap structure that gets extended (contiguous segment) with sbrk.
- malloc_state (or the Arena Header) – This structure will hold information about the arena itself, such as the bins, the top chunk, “last remainder chunk”, … The arena header for the main arena is contained in libc.so data segment because it is a global variable while the arena header for thread arenas will be handled in the sbrk’d heap segment.
- malloc_chunk (or the Chunk Header) – Every time a user create a memory request, it will create a chunk. One heap can handle multiple chunks.
Another image taken from sploitfun.wordpress.com
And if the thread arena contains multiple heaps:
Image taken from sploitfun.wordpress.com
What’s interesting about these graphs is that we can see that the free’d chunks still keep their malloc_chunk structure and don’t get coalesced, there’s also a pointer to the top of the Arena inside the Arena Header and a “prev” pointer inside the second memory mapping segment if a Thread arena contains multiple heaps. We can also see that the top chunk have a malloc_chunk structure as well and that the top chunk of a thread arena which expanded to two memory mapping segment will become a free chunk.
Chunks and their types
As said previously, a chunk is a memory segment present in a heap when a user will request memory, there are several chunk types:
- Allocated chunk
For allocated chunks, the malloc_chunk structure will hold this information:
[prev size OR previous chunk user data][size + last 3 bits containing flag information N M P]
The previous size will be specified only if the previous chunk is free’d, else, this field will contain the previous chunk’s “user data”. After that, there’s the size of the chunk’s data with the last three bits containing different flag (P or PREV_INUSE, set when the previous chunk is allocated, M or IS_MMAPED, set when the chunk is mmap’d, N or NON_MAIN_ARENA, set if it belongs to a thread arena). After this structure the chunk will contain the data itself.
There are normally other fields in malloc_chunk such as fd or bk but these will not be used for an allocated chunk and this space will be used for user data. The size requested by the user will be converted to a “usable size”, which is containing the malloc_chunk structure as well as space for alignment purpose, the conversion will also take into consideration the fact that the last three bits won’t be used and will be used to store the flag information.
- Free chunk
For free’d chunks, the malloc_chunk structure will hold this information:
[user data of previous chunk][size + last 3 bits containing flag information N M P][fd][bk][Unused space]
Two free’d chunk can’t be adjacent, because of that, the prev_size inside a malloc_chunk of a free chunk will always contain user data. If two adjacent chunks get freed it will coalesce into one single free chunk.
The size will contain the size of the free chunk obviously, the 3 bits are the same as for the allocated chunk. The fd (forward) pointer will be used to point to the next chunk in the same bin (NOT the next chunk in the physical memory), the bk (backward) pointer will be used for the previous chunk in the same bin.
- Top chunk
The chunk will be at the top border of an arena and doesn’t belong to any bin. This chunk will be used when there are not any available free blocks in any of the bins.
If the user requested memory’s size is less than the top chunk size, it will be split in two section: the user chunk (len(top chunk) – len(requested)) and the remainder chunk (len(top chunk) – len(user chunk)). The remainder chunk will then become the next top chunk. If the top chunk is too small for the requested size, it will be extended, either using sbrk (if we are in the main arena) or mmap (in a thread arena).
- Last remainder chunk
It is the remainder of a split due to a small request. If a user request a small size and it cannot be served by a small bin and unsorted bin, “binmaps” will be scanned to find the next largest (non-empty) bin which will be split in two, the user requested chunk which will be returned and the remainder chunk that will be put into the unsorted bin and become the “last remainder chunk”.
This chunk will be used when consecutive malloc of small sizes are requested for example.
Bins are the freelist datastructures of ptmalloc2 and are used to hold free chunks, different bins are available depending on chunk sizes and two datastructures are used: “fastbinsY” for fast bins and “bins” for unsorted bins (bin 1), small bin (bin 2 to 63) and large bin (bin 64 to 126).
- Fast bin
Between size of 16 to 80 bytes, chunks are called fast chunks and the bins holding them are the fast bins. Fast bins are generally faster in allocation/dealloc.
There are 10 fast bins which all contain a single linked list (binlist) of free chunks. The list is single linked since removal are not done in the middle but rather, the addition or deletion are done at the front end of the list (LIFO). The size differences between fast bins are 8 bytes, meaning the fastbin #0 will hold chunks of size 16, the #1 will hold chunks of size 24 and so on. All chunks inside a fastbin are of the same size.
At the initialization of malloc, the maximum fast bin size is set to 64 (and not 80?) bytes. By default chunks of size 16 to 64 are categorized as fast chunks.
There is no coalescing inside the fast bins and two chunks can be free’d can be adjacent to each other without being combined into one single free chunk, this is there for speed purposes.
During a malloc of the first fast chunk, the “fast bin max size” and the “fast bin indices” are empty first and even though the user requested a fast chunk, the small bin code will try to service it first. When it is not empty anymore the fast bin index will be calculated to retrieve the corresponding binlist and the first chunk from the binlist will be removed and returned to the user.
When a free is done, the fast bin index is calculated is calculated to retrieve the corresponding binlist in which the free chunk gets added at the front position.
Image taken from sploitfun.wordpress.com
In this image we can see the main arena’s fastbinsY structure in red containing different pointers to the linked lists. The first list being of size 16 bytes (fastbin #0), the second one of size 32 (fastbin #2) with the structure we described in the free chunk section. We can see that the bk is not implemented since it is a single linked list.
- Unsorted bin
When small or large chunks gets freed, they’ll be put in the unsorted bin so that glibc malloc could re-use them if necessary (once again, for speed purposes, because the time used to check for the appropriate bin is eliminated).
There is only one unsorted bin that contains a binlist as a circular double linked list and as said previously, there is not size restriction in this bin.
Image taken from sploitfun.wordpress.com
We can see there that since it is a circular double-linked list, rather than holding a pointer to the front of the list, there is two pointer fd and bk, respectively pointing to the front and end of the list, same thing for the small bins and the large bins I’ll talk about. Inside the linked list the bk of the first chunk and the fd of the last chunk will point to the bin.
- Small bin
Chunks of size less than 512 bytes are called small chunks and are stored in small bins. They are faster than large bins but slower than fast bins.
There are 62 small bins which are all using a circular double linked binlist since there could be a chunk unlinked in the middle of the list. The addition will happen at the front of the list and deletion at the end (FIFO).
The difference in small bins are of 8 bytes, the first small bin (Bin 2) contains bins of size 16 bytes, second one (bin 3) will hold bins of size 24 bytes and so on. Chunks inside a small bins are of the same size.
At the first call to malloc, small bins would be empty so it will try to be serviced by the unsorted bin, it will then initialize the malloc_state by pointing bins to themselves, signifying they are empty. Once it is not longer empty, the last chunk is removed and returned to the user.
When a free occurs on a small chunk, it will check if the previous or next chunk is free and will try to coalesce the chunks (by unlinking both chunks from their respective bins, consolidate them and link the new chunk to the beginning of the unsorted bin).
- Large bin
Chunks of size superior or equal to 512 bytes are called large chunks and are stored in large bins. They are slower than small bins in allocation and deallocation.
There are 63 large bins using a circular double linked binlist because in large bins, additions and removal are made from any position (front, middle or rear).
The difference in large bins vary, there are 32 bins which are 64 bytes apart (Bin 65 holds sizes between 512 to 568, bin 66 holds sizes between 576 to 632 and so on…), 16 bins which are 512 bytes apart, 8 bins which are 4096 bytes apart, 4 bins which are 32768 bytes apart, 2 bins which are 262144 bytes apart and finally, the last bin which is a chunk of the remaining size.
Unlike fast bins or small bins, chunks inside a large bins are not of the same size and are stored in decreasing order, the largest chunk is stored at the front end while the smallest is at the rear end.
Two free chunks cannot be adjacent and will be coalesced.
At the first call to malloc, all large bins would be NULL, hence the requested chunk will be serviced by the next largest bin code, similar to small bins, it will initialize malloc_state by pointing the small bins and large bins datastructures to themselves, signifying they are empty. When it is not empty, if the largest chunk size (in a binlist) is greater in size than the user requested chunk, binlist will be walked from the rear to the front to find a suitable chunk (near or equal in size) then it will split the adequate chunk in two: the user chunk of the requested size, returned to the user and the remainder chunk sent to the unsorted bin.
If the largest chunk size in a bin is smaller than the user requested size, it will go to the next largest non-empty bin (if found, it will split it then return it to the user, else it will try to serve the request using the top chunk).
The free procedure is similar to the small bin.
What did I learn today?
Phew, first of all, thank you for reading if you managed to get to this point. That is a big annoying chapter about learning heap exploitation we just finished. I always did push that learning to another day, and even though I’m not sure I’ll remember everything I’ve written by heart, I did understand it and I’ll be able to get back to this article if I have a doubt. Here’s what I learned:
- The different datastructures present in glibc ptmalloc2
- The difference in extension between the main arena and the thread arena
- The last remainder chunk (might need clarification)
- The previous chunk user_data when the previous chunk is allocated
- Refreshed my memory about the chunks and bins types
- Learned exactly how the bins get chosen and how a free chunk gets splitting
- How does a coalescence happen (unlink, coalesce then add to the unsorted bin)