Structure recognition

In association with heise online

In his presentation "Black Ops 2006" at the Chaos Communication Congress in Berlin [5], security expert Dan Kaminsky described an approach to intelligent fuzzing that does without complicated block-based protocol analysis yet still takes the data structure into account. The basic idea is to detect the structure of the data to be fuzzed algorithmically, so that this structure can largely be retained during fuzzing.

Kaminsky used the Sequitur [6] compression algorithm, which separates input data into a grammar (the structure) and a dictionary (useful data). His proof-of-concept fuzzer, the CFG9000 (Context Free Grammar Fuzzer) [7], is not yet fully developed, but it already applies this method well enough that browsers more or less manage to display an HTML document whose code has been tossed around a bit. No knowledge of the HTML format is needed; the fuzzer works with any data.

Even without HTML block analysis, Context Free Grammar Fuzzing provides a page that still somehow looks like heise.de.
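The idea of inferring a grammar and then mutating within it can be illustrated with a small sketch. This is not Sequitur and not the CFG9000: it uses a much simpler greedy pair-replacement scheme (similar in spirit to Re-Pair) as a stand-in, and the function names infer_grammar, expand and fuzz, as well as the mutation strategy, are assumptions made purely for illustration.

```python
import random
from collections import Counter

def infer_grammar(data: bytes, min_count: int = 2):
    """Greedy stand-in for grammar inference (NOT Sequitur itself).

    Repeatedly replaces the most frequent adjacent pair of symbols with a
    new rule. Symbols below 256 are literal bytes; rule ids start at 256.
    Returns (start, rules), where rules maps a rule id to its symbol pair.
    """
    seq = list(data)
    rules = {}
    next_id = 256
    while len(seq) >= 2:
        pair, count = Counter(zip(seq, seq[1:])).most_common(1)[0]
        if count < min_count:
            break
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_id)  # collapse the pair into one rule symbol
                i += 2
            else:
                out.append(seq[i])
                i += 1
        rules[next_id] = pair
        seq = out
        next_id += 1
    return seq, rules

def expand(symbols, rules) -> bytes:
    """Turn a symbol sequence back into bytes by expanding every rule."""
    out = bytearray()
    stack = list(reversed(symbols))
    while stack:
        sym = stack.pop()
        if sym < 256:
            out.append(sym)
        else:
            left, right = rules[sym]
            stack.append(right)  # pushed first so that left is expanded first
            stack.append(left)
    return bytes(out)

def fuzz(data: bytes, rate: float = 0.05, seed=None) -> bytes:
    """Hypothetical mutation step: swap whole rules, or single literal bytes."""
    rng = random.Random(seed)
    start, rules = infer_grammar(data)
    rule_ids = list(rules)
    mutated = []
    for sym in start:
        if rng.random() < rate:
            if sym >= 256:
                sym = rng.choice(rule_ids)  # replace one structural unit by another
            else:
                sym = rng.randrange(256)    # replace a literal byte
        mutated.append(sym)
    return expand(mutated, rules)
```

Because mutations in this sketch exchange whole rules rather than individual bytes, recurring units such as tags tend to survive intact, which is roughly the effect described above. The quadratic rescanning is acceptable for a demonstration but would be too slow for large inputs.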

Not dead yet

A major challenge in fuzzing is reliably determining when a programming error has occurred. If the program is running on a computer to which the tester has direct access, several options are available. One commonly used method is to attach a debugger, such as IDA Pro or OllyDbg for Windows or gdb for Linux, to the running process. If an access violation occurs during fuzzing, the debugger immediately catches it for further analysis.
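Where attaching a full debugger is too heavy, a lighter substitute on POSIX systems is to start the target as a child process and check how it ended. The following sketch assumes the target reads its test case from standard input; the function name run_case and the set of signals treated as crashes are illustrative choices, not part of any tool mentioned here.

```python
import signal
import subprocess

# Signals that usually point to a programming error (POSIX only).
CRASH_SIGNALS = {
    signal.SIGSEGV,  # invalid memory access
    signal.SIGBUS,   # misaligned or otherwise invalid access
    signal.SIGABRT,  # abort(), e.g. a failed assertion
    signal.SIGILL,   # illegal instruction
    signal.SIGFPE,   # arithmetic error such as integer division by zero
}

def run_case(argv, data: bytes, timeout: float = 5.0) -> str:
    """Feed one test case to the target on stdin and classify the outcome.

    Returns "crash", "hang" or "ok".
    """
    try:
        proc = subprocess.run(
            argv,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    # On POSIX, a negative return code -N means "terminated by signal N".
    if proc.returncode < 0 and -proc.returncode in CRASH_SIGNALS:
        return "crash"
    return "ok"
```

Unlike a debugger, this only reports that a crash happened, not where. Test cases classified as "crash" would typically be saved and replayed under gdb for the actual analysis.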

Furthermore, a sudden spike in CPU load or memory consumption may indicate that the fuzzer has found a programming flaw. Common operating system tools and special test programs such as Valgrind can be used to detect memory leaks. Tracing, the step-by-step logging of a program's flow, is often impractical because it drastically slows down the programs being tested.
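A rough way to spot such CPU spikes from a test harness is to compare the CPU time a test case consumes against a baseline measured with a harmless input. The sketch below is POSIX-only, measures CPU time only (not memory), and assumes the harness runs one child at a time; the names cpu_seconds and looks_suspicious and the factor of 10 are hypothetical.

```python
import resource
import subprocess

def cpu_seconds(argv, data: bytes, timeout: float = 30.0) -> float:
    """Run one test case and return the CPU time (user + system) it used."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    try:
        subprocess.run(
            argv,
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        pass  # subprocess.run() has already killed and reaped the child
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return (after.ru_utime + after.ru_stime) - (before.ru_utime + before.ru_stime)

def looks_suspicious(cpu: float, baseline: float, factor: float = 10.0) -> bool:
    """Flag a test case that used far more CPU time than the baseline run."""
    return cpu > baseline * factor
```

The threshold is deliberately crude. In practice the baseline would be averaged over several runs, and flagged inputs would be re-examined with a tool such as Valgrind.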

Without direct access to the computer on which the program under test is running, these methods cannot be applied. In such black box fuzzing, however, the network traffic may allow some useful conclusions to be drawn. If the tester receives unusual reply packets or none at all, or if the service remains unreachable for a long period, it may have crashed.
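Such a black box check can be sketched as a simple TCP probe that sends one payload and classifies what comes back. The function name probe, the result labels and the 4096-byte read are assumptions for illustration; a real harness would need protocol-specific logic to decide what counts as an "unusual" reply.

```python
import socket

def probe(host: str, port: int, payload: bytes, timeout: float = 2.0) -> str:
    """Send one payload to a TCP service and classify its reaction.

    Returns "unreachable", "no-reply", "reset", "closed" or "reply".
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"  # refused or timed out: the service may be down
    with sock:
        try:
            sock.sendall(payload)
            reply = sock.recv(4096)
        except socket.timeout:
            return "no-reply"  # connected, but nothing came back in time
        except OSError:
            return "reset"     # connection was torn down mid-exchange
    return "reply" if reply else "closed"
```

A result of "unreachable" immediately after a test case that previously produced "reply" is a strong hint that the last input brought the service down, although a restart by a watchdog process can hide this.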

Some fuzzing frameworks have been developed specifically for these purposes. For instance, Autodafe uses debuggers and tracers to analyze how the programs under study react to the input data they are given. But this field of research is still in its infancy.

Practical problems

Testers also face a number of other practical problems. Occasionally, critical programming flaws are hidden behind minor flaws that crash the program before the critical flaws can be reached. Moreover, more or less harmless memory leaks in the programs can create a mountain of worthless data that quickly fills up the available RAM. As a result, you might have to reboot the system frequently. Graphical user interfaces, such as browsers with pop-up dialog windows, can also throw a spanner in the works.

Ways of getting around GUIs, such as Apple's AppleScript language, which allows some keyboard and mouse input to be automated, unfortunately do not cover enough functions and are not available at all on other systems. In such cases, testers can only resort to an editor or debugger to remedy problems in the source code or, as a more complicated option, patch the binary program.

Roadmap

In the past couple of years, fuzzing has become immensely popular, as the skyrocketing number of highly specialized fuzzing tools demonstrates. Some of the tools can even be used without technical expertise. Not surprisingly, alleged "security advisories" have recently been popping up on security-related mailing lists that clearly reveal that the author found the reported flaw by fuzzing. Announcements to the effect that "version X of program Y crashes when used to open a file containing Z" completely lack technical descriptions, leaving even security experts in the dark about the actual effects and potential remedies.

At the moment, special fuzzing tools are only available for a few protocols and formats, and more research is needed in a number of areas. Although fuzzing is still a fledgling method, it is already highly successful. Recent approaches, such as Context Free Grammar Fuzzing, show that there is still a lot of room for new ideas and improvements. Nowadays, programming flaws are being discovered almost as quickly as they are produced. But developers are already overworked and can hardly keep up with the announcements of vulnerabilities that need patches, which does not exactly make the computer world safer.

As long as software firms do not directly make any money by ensuring the security of their own products, they will be loath to devote any additional capacity to remedying security vulnerabilities. But it would be a step in the right direction if companies would take advantage of their expertise and employ efficient fuzzing as an integral component of their quality assurance in development.

Fuzzing is a very powerful tool that experts can use to quickly find errors they never expected - no more, no less. An analysis of any vulnerabilities detected and a demonstration of their potential ramifications, such as in fully functional exploits, would still require some pretty clever thinking. (cr)