An almost realistic scenario
Let's now take a look at a simple sample application that uses SimpleHeap to see how attackers can have their own code executed by these means. In this example, we will be using an image processing program for files in a very simplified graphics format.
Our sample image file starts with four bytes for each of the width and the height of the image, followed by one byte per pixel of colour information. The program that the user launches to process the image on his system reads the information about width and height from the file provided by the attacker and reserves the necessary amount of space on SimpleHeap to save the image data for further processing.
In addition, the program reserves memory on SimpleHeap according to the width information, so that individual lines can be read out of the image file, processed, and then written into the large memory block reserved for all of the image data.
The following pseudo-listing illustrates the individual steps:
1 Read "width"
2 Read "height"
3 Allocate memory labelled "image" of the size "width * height"
4 Loop, for each line in the image ( 0 <= i < height ) :
6 Allocate memory labelled "line" of size "width"
7 Read "width" bytes of data into "line"
8 Store "line" in memory area "image" at position "width * i"
9 Release memory labelled "line"
10 End loop
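In C, the steps of the listing might look roughly like the following sketch. It is not the sample program's actual source: load_image() and read_le32() are hypothetical names, the file is assumed to be already in memory, and malloc()/free() stand in for SimpleHeap. The sketch also adds a size check that the listing does not contain.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Decode a 32-bit little-endian value from four bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * load_image() is a hypothetical helper that walks through the listing's
 * steps on an in-memory copy of the file. malloc()/free() stand in for
 * SimpleHeap. Returns the image buffer (the caller frees it) or NULL.
 */
unsigned char *load_image(const unsigned char *file, size_t file_len,
                          uint32_t *width_out, uint32_t *height_out)
{
    if (file_len < 8)
        return NULL;

    uint32_t width  = read_le32(file);      /* Read "width"  */
    uint32_t height = read_le32(file + 4);  /* Read "height" */
    const unsigned char *src = file + 8;
    size_t remaining = file_len - 8;

    /* Not part of the listing: reject empty images, products that do not
       fit into size_t, and headers that announce more data than exists. */
    if (width == 0 || height == 0 || height > SIZE_MAX / width)
        return NULL;
    if ((size_t)width * height > remaining)
        return NULL;

    /* Allocate "image" of size width * height */
    unsigned char *image = malloc((size_t)width * height);
    if (image == NULL)
        return NULL;

    for (uint32_t i = 0; i < height; i++) {
        unsigned char *line = malloc(width);            /* Allocate "line" */
        if (line == NULL) {
            free(image);
            return NULL;
        }
        memcpy(line, src, width);                       /* Read "width" bytes */
        src += width;
        memcpy(image + (size_t)width * i, line, width); /* Store at width * i */
        free(line);                                     /* Release "line" */
    }

    *width_out = width;
    *height_out = height;
    return image;
}
```

The per-line buffer is allocated and released on every pass through the loop, exactly as in the listing, so "line" always ends up on the heap after "image".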
Let us give the application programmer the benefit of the doubt and assume that he thought about how to make sure that his program can work properly even if it is presented with corrupt image files. For instance, it only reads the amount of data described in the header for the image and then places it into the buffer space reserved for this purpose. Nonetheless, our programmer can commit a grave mistake. In the calculation for the memory space needed for all of the image data, we find:
image = (unsigned char *) SimpleHeap_alloc(width * height, myRoot);
What at first seems to be a proper implementation of what is needed produces a fatal error with the following picture information:
Width = 0x00000080 = 128
Height = 0x10000000 = 268435456
The function SimpleHeap_alloc() expects a size_t as its first argument, and this data type is exactly 32 bits wide in a conventional 32-bit program. As a consequence, only the lower 32 bits of the result of the multiplication are passed to the function. Here, these are the eight trailing zeros of 0x800000000.
The important information, the leading 8 that lies above the lower 32 bits, is lost, and the program reserves 0 bytes of memory space for the image data. Technically, this operation is correct, and SimpleHeap is doing its job properly in executing it. Unfortunately, 0 bytes is not enough to save the subsequent image data. The result is a heap overflow, here as a result of an integer overflow during the calculation of the size.
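The arithmetic can be checked with a few lines of C. This is only an illustration: the function names are made up for this example, and uint32_t is used to model the 32-bit size_t of the sample program so that the result is the same on a 64-bit build machine.

```c
#include <stdint.h>

/* Full mathematical product, computed in 64 bits. */
uint64_t size_wide(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height;
}

/* What a 32-bit size_t parameter receives: the lower 32 bits only. */
uint32_t size_truncated(uint32_t width, uint32_t height)
{
    return (uint32_t)size_wide(width, height);
}

/* Returns 1 if the product fits into 32 bits, 0 if it would be cut off. */
int size_fits_32(uint32_t width, uint32_t height)
{
    return size_wide(width, height) <= UINT32_MAX;
}
```

For a width of 0x80 and a height of 0x10000000, size_wide() yields 0x800000000, size_truncated() yields 0, and size_fits_32() reports that the value does not fit. A comparison of this kind before the allocation would be one way for the program to notice the problem.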
If the program opens an image with this size information, merely saving the image data will overwrite the memory allocated for "line", which is right after "image" on the heap.
00: 80 00 00 00 00 00 00 10 41 41 41 41 41 41 41 41 ........AAAAAAAA
10: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
20: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
30: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
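A file like the one in the dump could be produced with a small helper along the following lines. This is a sketch, not the tool used for the article: build_sample() and write_le32() are hypothetical names, and the helper writes into a caller-supplied buffer instead of a file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value least significant byte first (little-endian). */
static void write_le32(unsigned char *out, uint32_t value)
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}

/*
 * build_sample() is a hypothetical helper: it fills buf with the 8-byte
 * header (width, then height) followed by 'A' (0x41) bytes up to len.
 * Returns 0 on success or -1 if buf is too small for the header.
 */
int build_sample(unsigned char *buf, size_t len,
                 uint32_t width, uint32_t height)
{
    if (len < 8)
        return -1;
    write_le32(buf, width);
    write_le32(buf + 4, height);
    memset(buf + 8, 0x41, len - 8);
    return 0;
}
```

Called with a 0x40-byte buffer, a width of 0x80 and a height of 0x10000000, the helper produces the four rows shown in the dump.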
The bytes of the width and height values appear, incidentally, in reverse order in the file because Intel CPUs store data in little-endian format, starting with the least significant byte. The sample program crashes when it starts to process the image, as soon as it reaches line 9 in the first pass through the loop, with the SimpleHeap_free function causing a memory protection violation when the address 0x4141414d is read in the test:
if ( 0 == hdr->next->used )
What happened, and more importantly, where did that "d" come from? The A characters overwrote the line block's header in line 8. Line 9 then calls SimpleHeap_free() to release line again. While the variable hdr still refers to the right place in memory, the image data has overwritten the content stored there.
As a result, the expression hdr->next has the value 0x41414141, which is equivalent to four uppercase ASCII As (0x41). The element used in a SimpleHeap header is at offset 12 from the beginning of the header, behind the 4 bytes each for next, prev and size. This results in the following address calculation:
0x41414141 + 0x0c = 0x4141414d
If we now try to read the value of used, a memory access is attempted at this address, to which no memory is assigned, producing a memory protection violation. Thus, it is clear that attackers can control the heap and overwrite it with their own data. In addition, after this test they also know that image data can overwrite the header of the following block.
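The address calculation can be reproduced with a struct that mirrors the header layout described above. The struct below is an assumption based only on the field order and sizes given in the text, not SimpleHeap's real definition. The names header32 and used_address are invented for this sketch, pointers are modelled as uint32_t so that the offsets match a 32-bit process on any build machine, and the width of used is a guess, since only its offset matters here.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of a SimpleHeap block header in a 32-bit process. */
struct header32 {
    uint32_t next;  /* offset 0:  address of the next header     */
    uint32_t prev;  /* offset 4:  address of the previous header */
    uint32_t size;  /* offset 8:  size of the block              */
    uint32_t used;  /* offset 12: flag tested by the free routine */
};

/* Address that reading hdr->next->used touches for a given hdr->next. */
uint32_t used_address(uint32_t next_value)
{
    return next_value + (uint32_t)offsetof(struct header32, used);
}
```

With next overwritten by four A characters, used_address(0x41414141) evaluates to 0x4141414d, the address reported in the crash.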