

Egurra: A dumb file format fuzzer

This tool isn’t brand new; it has been lurking in my archive for a couple of months, and I’ve finally found a time window to release it publicly. A couple of months ago, while reading Gray Hat Python [1], I thought I’d build my own dumb file format fuzzer as an exercise. This post shares the results of my ramblings. The tool is called Egurra, which is Basque for “wood” and comes from the Basque expression “egurra eman”, which roughly translates to “give it the beans” and captures the brute-force essence of fuzzing.

The fuzzer is made up of three components, arranged in the following architecture:

Fuzzer architecture.

The main user interface, labeled Main in the diagram, is the terminal interface from which the user interacts with the fuzzer and initializes the engine. The fuzzing engine, labeled Engine, is the first of the two main processes that do the actual work: it takes care of spawning, monitoring and killing the target processes. The last component, labeled Mutator, picks a random file from the available sample pool and mutates it randomly; the result is then fed to the processes that the Engine spawns.
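To make the Engine's spawn, monitor and kill cycle concrete, here is a minimal sketch. It is not Egurra's code: it swaps PyDbg for the standard subprocess module, so it can only tell whether the target exited on its own or had to be killed, and it cannot capture the register and memory dumps a debugger gives you. The function name run_target and the result labels are made up for illustration.

```python
import subprocess
import sys

def run_target(command, timeout):
    """Spawn `command`, wait up to `timeout` seconds, kill it if still alive.

    Returns ("timeout", None) if the process had to be killed, otherwise
    ("exited", returncode).
    """
    proc = subprocess.Popen(command)
    try:
        returncode = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # the target is still running: terminate it
        proc.wait()  # reap the process so its file handles are released
        return ("timeout", None)
    return ("exited", returncode)

if __name__ == "__main__":
    # A stand-in target that would run for 30 seconds gets killed after 0.5.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_target(sleeper, 0.5))
```

Waiting for the killed process before returning matters in a loop like this, because the next iteration wants to overwrite the same temporary input file.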

Egurra uses a directory structure to randomly pick samples from, store crashes, and store temporary mutated files. Sample files are stored in the “/examples” directory and the generated crashes are stored in the “/crashes” directory. For the latter, two files are generated for each crash: one that stores the input file that triggered the crash, and another with some processor and memory dumps that aim to aid in debugging the exception. The temporary files are stored in the root directory and are overwritten in each iteration.
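The layout above could be handled roughly as follows. This is a sketch under assumptions, not Egurra's actual code: the helper names pick_sample and save_crash, the crash_<id> naming scheme and the .dump.txt suffix are all hypothetical.

```python
import os
import random
import shutil

SAMPLES_DIR = "examples"  # sample pool
CRASHES_DIR = "crashes"   # one input file plus one dump file per crash

def pick_sample(samples_dir=SAMPLES_DIR, rng=random):
    """Return the path of a randomly chosen file from the sample pool."""
    names = sorted(
        name for name in os.listdir(samples_dir)
        if os.path.isfile(os.path.join(samples_dir, name))
    )
    if not names:
        raise RuntimeError("sample pool %r is empty" % samples_dir)
    return os.path.join(samples_dir, rng.choice(names))

def save_crash(input_path, dump_text, crash_id, crashes_dir=CRASHES_DIR):
    """Store the crashing input and its debugging dump side by side."""
    os.makedirs(crashes_dir, exist_ok=True)
    ext = os.path.splitext(input_path)[1]
    # Hypothetical naming scheme: crash_<id><ext> and crash_<id>.dump.txt
    saved_input = os.path.join(crashes_dir, "crash_%d%s" % (crash_id, ext))
    saved_dump = os.path.join(crashes_dir, "crash_%d.dump.txt" % crash_id)
    shutil.copyfile(input_path, saved_input)
    with open(saved_dump, "w") as handle:
        handle.write(dump_text)
    return saved_input, saved_dump
```

Copying the temporary file into the crashes directory right away is the important part, since the temporary file is overwritten on the next iteration.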

As you might have already guessed, all the code is written in Python and makes extensive use of the PyDbg library to launch and monitor processes. Most of the code is borrowed from Gray Hat Python (I told you it was an exercise :-) and from kind code donations by matalaz [2] (cheers buddy!). The algorithm used to mutate the randomly picked samples is adapted from Charlie Miller’s talk Babysitting an Army of Monkeys [3]. Here’s a little snippet:

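# Number of bytes to mutate: between 1 and ceil(edk_len / factor)
moldatzeko = random.randrange(math.ceil(float(edk_len)/factor))+1
for i in range(moldatzeko):
	kar = random.randrange(256)          # random byte value
	kar_num = random.randrange(edk_len)  # random offset in the file
	# Replace the byte at offset kar_num, keeping the file length unchanged
	edk2 = edk[0:kar_num]
	edk2 = edk2 + "%c" % (kar)
	edk2 = edk2 + edk[kar_num+1:edk_len]
	edk = edk2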

In this piece of code, edk_len is the length in bytes of the chosen file and edk holds its contents. factor is the so-called fuzz factor, which is used to compute moldatzeko, the number of bytes that will be mutated in the input file: a random value between 1 and edk_len/factor (rounded up), so a larger fuzz factor means fewer mutated bytes. From then on, it’s a simple for loop that overwrites bytes at random offsets with random values.
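For readers on Python 3, the same idea can be written as a small self-contained function over bytes. This is a sketch of the technique and not the code shipped with Egurra; the names mutate and rng are illustrative, and the optional rng argument is only there so a run can be reproduced with a seeded generator.

```python
import math
import random

def mutate(data, factor, rng=random):
    """Overwrite a random number of bytes in `data` with random values.

    Between 1 and ceil(len(data) / factor) bytes are replaced, so a larger
    fuzz factor means fewer mutations. The length never changes.
    """
    if not data:
        return bytes(data)
    buf = bytearray(data)  # mutable copy, so each byte is overwritten in place
    max_mutations = int(math.ceil(len(buf) / float(factor)))
    count = rng.randrange(max_mutations) + 1
    for _ in range(count):
        position = rng.randrange(len(buf))  # random offset
        buf[position] = rng.randrange(256)  # random byte value
    return bytes(buf)

if __name__ == "__main__":
    sample = b"GIF89a" + bytes(64)
    fuzzed = mutate(sample, 10, random.Random(1))
    print(len(sample), len(fuzzed))
```

Working on a bytearray avoids rebuilding the whole buffer on every iteration, which starts to matter with large samples and small fuzz factors.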

Since the design knows nothing about the structure of the file format being fuzzed, it’s pretty easy to use. All you need to do is build a good sample pool, which is in itself one of the key factors of a successful fuzzing session. Code coverage would be an interesting topic for another post, but we will leave it aside for the rest of this one. Once the pool is set up, we launch the fuzzer with the following command line:

python egurra.py timeout executable-path file-extension fuzz-factor
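As an illustration of that interface, the four positional arguments could be parsed along these lines. This is a hedged sketch, not the parsing code in egurra.py: treating timeout as a number of seconds and the fuzz factor as a positive number are assumptions, and parse_args is a hypothetical helper name.

```python
import argparse

def parse_args(argv=None):
    """Parse: timeout executable-path file-extension fuzz-factor."""
    parser = argparse.ArgumentParser(
        prog="egurra.py", description="Dumb file format fuzzer"
    )
    # Assumption: the timeout is expressed in seconds.
    parser.add_argument("timeout", type=float,
                        help="time to let the target run before killing it")
    parser.add_argument("executable_path", metavar="executable-path",
                        help="program to feed the mutated files to")
    parser.add_argument("file_extension", metavar="file-extension",
                        help="extension given to the temporary mutated file")
    parser.add_argument("fuzz_factor", metavar="fuzz-factor", type=float,
                        help="larger values mutate fewer bytes")
    args = parser.parse_args(argv)
    if args.timeout <= 0 or args.fuzz_factor <= 0:
        parser.error("timeout and fuzz-factor must be positive")
    # Accept both "jpg" and ".jpg" on the command line.
    args.file_extension = args.file_extension.lstrip(".")
    return args

if __name__ == "__main__":
    print(parse_args())
```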

From then on it’s a matter of luck and time to pop some interesting crashes. As the iterations go by, the fuzzer will print the number of iterations completed and the file that was picked for mutation during that iteration. So there you go, a simple, rough tool to hunt for possible file format vulnerabilities. Download is available here.

References:

[1] – Justin Seitz. Gray Hat Python. http://nostarch.com/ghpython.htm

[2] – Joxean Koret. http://joxeankoret.com/blog/

[3] – Charlie Miller. Babysitting an Army of Monkeys: An Analysis of Fuzzing 4 Products with 5 Lines of Python. http://securityevaluators.com/content/news/index.jsp?topic=cansecwest_2010

Posted in Books, Fuzzing.



9 Responses


  1. jowjow says

    File "egurra.py", line 34, in fuzzCallback
    idatz = open("temp.%s" % self.ext,"w")
    IOError: [Errno 13] Permission denied: 'temp.jpg'

    this happens quite often after a short time.

  2. jon says

    Now that’s weird :(. It never happened to me, and I’ve had it running in sessions lasting a couple of weeks against different apps. Which platform are you running on? All my test and fuzz sessions were performed on an XP SP3 box.
    File permission issues popped into my mind when I read your comment, but those would make it fail in EVERY single iteration and not after a short time as you say.

  3. jowjow says

    XP SP3 in a VMware Workstation virtual machine. The problem arises with both mspaint and some other things I tested. Perhaps it’s a slight race where the app has not closed before the fuzzer tries to write to the file again? I considered saving the file under .ext instead to avoid this race. Also, the extension supplied on the command line seems to be unused when collecting a file from examples/, is that correct?

  4. jon says

    Your theory sounds very likely: race conditions might force you to increase the parameter to allow enough time for the virtualized OS and app to finish their file handling. Regarding the other question, you’re right. The parameter just takes care of assigning an extension to the temporary file, which is the one sent as input to the fuzzed app.
    Taking, for instance, a GIF file, mutating it and feeding it as a JPG file to MS Paint sounds like an even more aggressive fuzzing approach. You might just have found a new path there!
    Thanks for the feedback, take care :)

  5. jowjow says

    I’ll mess around with the runtime. Thanks for releasing the code, which saved me from having to write it myself (Gray Hat Python was good :). I translated some of the code to English and gave the variables more meaningful names for clarity:

    file_content = open(fitx,"r").read()
    file_length = len(file_content)

    random.seed(time.time())
    numberofbytestoswap = random.randrange(math.ceil(float(file_length)/factor))+1
    print "[*] swapping %d bytes" % numberofbytestoswap
    for i in range(numberofbytestoswap):
        randomcharacter = random.randrange(256)
        # print "inserting this byte: %d" % randomcharacter
        whereinthefile = random.randrange(file_length)
        # replace the byte at offset whereinthefile, keeping the length unchanged
        new_file_content = file_content[0:whereinthefile]
        new_file_content = new_file_content + "%c" % (randomcharacter)
        new_file_content = new_file_content + file_content[whereinthefile+1:file_length]
        file_content = new_file_content
    func(file_content)

  6. jon says

    Great submission :)
    If this goes further we can also set up a GoogleCode project or something for a more formal development approach.

  7. jowjow says

    This allowed it to run for a long time with the same runtime without any errors obviously,

    def fuzzCallback(self,buf):
        try:
            idatz = open("temp.%s" % self.ext,"w")
            idatz.write(buf)
            idatz.close()
        except IOError:
            print "[*] failed creating file this time, wasting time by reopening it"

  8. jon says

    Now that’s interesting. Thanks for the submission! :)

  9. jowjow says

    I made some other modifications as well, once I am done tweaking and start deploying it on my nodes I will drop you a copy.


