Egurra - A dumb file format fuzzer

This tool isn't brand new; it has been lurking in my archive for a couple of months now, and I've finally found a time window to release it publicly. A couple of months ago, while I was taking a look at Gray Hat Python [1], I thought of building my own dumb file format fuzzer as an exercise. This post shares the results of my ramblings. The tool in question is called Egurra, which in Basque stands for wood, and comes from the Basque expression "egurra eman", which kinda translates into "give it the beans", which in turn is the brute-forcish essence of fuzzing. The fuzzer comprises 3 components that define the following architecture:

Fuzzer architecture.

The user interface, depicted in the diagram as Main, is the terminal interface from which the user interacts with the fuzzer and initializes the engine. The fuzzing engine, labeled Engine in the diagram, is the first of the two main processes that do the actual work. This component takes care of spawning, monitoring and killing the launched processes. The last component, labeled Mutator, takes care of picking a random file from the available sample pool and mutating it randomly, so that it can later be fed to the processes that the Engine is spawning.

Egurra uses a directory structure to randomly pick samples from, store crashes, and store temporary mutated files. Sample files are stored in the "/examples" directory and the generated crashes are stored in the "/crashes" directory. For the latter, two files are generated for each crash: one that stores the input file that triggered the crash, and another with some processor and memory dumps that aim to aid in debugging the exception. The temporarily generated files are stored in the root directory and get overwritten in each iteration.

As you might have already guessed, all the code is written in Python and makes extensive use of the PyDbg library to launch and monitor processes. Most of the code is borrowed from Gray Hat Python (I told you it was an exercise :-) and from kind code donations by matalaz [2] (cheers buddy!). The algorithm used to mutate the randomly picked samples is an adaptation from Charlie Miller's talk Babysitting an Army of Monkeys [3]. Here's a little snippet:
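import math
import random

# Number of bytes to mutate: between 1 and ceil(edk_len / factor)
moldatzeko = random.randrange(int(math.ceil(float(edk_len) / factor))) + 1
for i in range(moldatzeko):
	kar = random.randrange(256)          # random replacement byte value
	kar_num = random.randrange(edk_len)  # random offset into the file
	# Overwrite the byte at kar_num; the length of edk stays the same
	edk = edk[:kar_num] + ("%c" % kar) + edk[kar_num + 1:]
In this piece of code,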
edk_len stands for the length in bytes of the chosen file, and factor is the so-called fuzz factor, which is used to compute moldatzeko, the number of bytes that will be mutated in the input file. For example, with a 1,000-byte sample and a fuzz factor of 50, between 1 and 20 bytes get overwritten. From then on, it's a simple for loop that overwrites random bytes with random values.

Since the design ignores the actual structure of the file format being fuzzed, it's pretty easy to use. All you need to do is build a good sample pool, which by itself is one of the key factors of a successful fuzzing session. Code coverage could be an interesting topic for another post, but we will leave it aside for the rest of this one. Once the pool is set up, we launch the fuzzer with the following command line:
python egurra.py timeout executable-path file-extension fuzz-factor
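To get a feel for what the fuzz-factor argument does before launching a real session, the mutation step described earlier can be tried on its own. What follows is only a minimal sketch, not Egurra's actual code: the function name mutate, the rng parameter, and the use of Python 3 bytes are my own assumptions for illustration.

```python
import math
import random


def mutate(data: bytes, factor: float, rng: random.Random) -> bytes:
    """Overwrite a random number of bytes in `data` with random values.

    Between 1 and ceil(len(data) / factor) bytes are rewritten, so a larger
    fuzz factor means fewer mutations. The output has the same length as
    the input.
    """
    if not data:
        return data  # nothing to mutate in an empty sample
    buf = bytearray(data)
    max_writes = math.ceil(len(buf) / factor)
    writes = rng.randrange(max_writes) + 1
    for _ in range(writes):
        offset = rng.randrange(len(buf))   # random position in the sample
        buf[offset] = rng.randrange(256)   # random replacement byte
    return bytes(buf)


if __name__ == "__main__":
    sample = bytes(1000)  # stand-in for a file read from the sample pool
    fuzzed = mutate(sample, factor=50, rng=random.Random(1234))
    changed = sum(a != b for a, b in zip(sample, fuzzed))
    print(len(fuzzed), "bytes,", changed, "of them changed")
```

Passing a seeded random.Random makes a given mutation reproducible, which can help when trying to re-create a crash. The count of changed bytes can be lower than the number of writes, since the same offset may be hit twice or a byte may be replaced by its own value.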
From then on it's a matter of luck and time to pop some interesting crashes. As the iterations go by, the fuzzer will print the number of iterations completed and the file that was picked for mutation during that iteration. So there you go, a simple, rough tool to hunt for possible file format vulnerabilities. Download is available here.

References:
[1] - Justin Seitz. Gray Hat Python. http://nostarch.com/ghpython.htm
[2] - Joxean Koret. http://joxeankoret.com/blog/
[3] - Charlie Miller. Babysitting an Army of Monkeys: An Analysis of Fuzzing 4 Products with 5 Lines of Python. http://securityevaluators.com/content/news/index.jsp?topic=cansecwest_2010