Testing Google Skipfish
A first impression of Google's Skipfish scanner for web applications
by Felix 'FX' Lindner, founder of Recurity Labs
According to a Google security blog post by developer Michal Zalewski, Google's new, free Skipfish scanner is designed to be fast and easy to use while incorporating the latest in cutting-edge security logic. Felix 'FX' Lindner examines Skipfish to see how well it compares to other tools used to check web site integrity.
When checking the security of web applications, developers use automated scanners to gain an overview which can then be refined by further manual testing. Depending on the user's requirements and expertise, this may involve minimalist basic tools such as Nikto, or comprehensive commercial software such as Rational AppScan. The main aim of such a scan is to reveal configuration and implementation flaws in the interaction between web servers, application servers, application logic and other components of modern web applications. Typical vulnerabilities detected this way include SQL injection, cross-site scripting, cross-site request forgery and code injection.
The curtain goes up
Skipfish, released in March, supports all the features one could wish for when doing a generic web site scan: it can handle cookies, process authentication details and the values of HTML form variables, and even use a single session token to navigate the target site as an authenticated user. One of Skipfish's specialities is to run through all the possible file and directory names in order to detect items such as script backups the admin has forgotten about, compressed archives of entire web applications, or SSH/Subversion configuration files that may have accidentally been left behind. As such items can only be tracked down by trial and error, the scanner combines various known file extensions with all the file names it detects by following the normal links on the site.
It also tries out a few hand-picked keywords (probably extracted from Google's search index) as directory and file names. Skipfish is especially noteworthy in this respect because it actually generates and tests all possible combinations. This means that every keyword is combined with every file extension and every actual file found on the web server, and that the result is tested as a file name, as a directory name, and as an argument for an HTTP POST request. This approach generates a very large number of combinations which could prove overwhelming. Thankfully, Skipfish provides predefined dictionaries of varying sizes, allowing users to determine the extent of the request flood generated. However, even the minimal starter dictionary recommended by Zalewski includes 2,007 entries and produces about 42,000 requests for each directory tested.
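The combinatorial growth described above can be illustrated with a minimal Python sketch. It is not Skipfish's actual algorithm: the keyword and extension lists are made up for illustration, and the three probe kinds simply mirror the file name, directory name and POST argument tests mentioned in the text.

```python
from itertools import product

# Hypothetical sample data; Skipfish's real dictionaries are far larger.
keywords = ["backup", "admin", "index"]
extensions = ["bak", "zip", "old", "php"]

# The three ways a candidate name is tested, as described in the article.
PROBE_KINDS = ("file", "directory", "post-argument")


def candidate_names(keywords, extensions):
    """Return the bare keywords plus every keyword.extension pair."""
    names = list(keywords)
    names += [f"{kw}.{ext}" for kw, ext in product(keywords, extensions)]
    return names


def probes(keywords, extensions):
    """Pair every candidate name with every probe kind."""
    return list(product(PROBE_KINDS, candidate_names(keywords, extensions)))


names = candidate_names(keywords, extensions)
all_probes = probes(keywords, extensions)
print(len(names))       # 3 keywords + 3 * 4 combinations = 15
print(len(all_probes))  # 15 names * 3 probe kinds = 45

# Ratio implied by the figures quoted for the minimal dictionary.
requests_per_entry = 42_000 / 2_007
print(round(requests_per_entry, 1))  # about 20.9 requests per entry
```

Even with three keywords and four extensions, the sketch produces 45 probes for a single directory, and the figures quoted for the minimal dictionary work out at roughly 21 requests per dictionary entry.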
To gain an impression of Skipfish, we briefly investigated a few typical scenarios: We used a Microsoft Internet Information Server 7.5 under Windows 7 with a ScrewTurn Wiki 3.0.2 for ASP.NET as our typical interactive web application. A Linux-based Apache 2.2.3 server with traditional CGI scripts written in Perl represented the more dated approach. An old HP LaserJet printer with ChaiServer 3.0 was used as a typical cause of false alarms and scanning problems.
Skipfish is only available as source code and must be compiled for the target platform. This requires the libidn library for encoding and decoding Internationalised Domain Names (IDN) to be installed; under Ubuntu 9.10, for example, this is easily done by running sudo apt-get install libidn11-dev in a terminal window. After successfully compiling the source code, users select a dictionary from the dictionary subdirectory and copy it, under the name skipfish.wl, to the directory which contains the Skipfish binary. Alternatively, the path to the dictionary can be passed as an option. It should be noted that Skipfish will automatically extend this dictionary file, so it is always advisable to use a copy.
Running ./skipfish -o example-log http://www.example.com/ starts the server scan and places a scan report in the example-log directory once the scan has completed. To check only the "blog" subdirectory on the server, enter ./skipfish -I /blog -o blog-report http://www.example.com/blog instead.
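For scripted runs, the two invocations above could be assembled by a small helper such as the following Python sketch. The function name skipfish_command and its parameters are hypothetical; only the -o and -I options and the example URLs are taken from the text, and the sketch builds the argument list without starting a scan.

```python
def skipfish_command(url, output_dir, include_path=None, binary="./skipfish"):
    """Build a Skipfish argument list (hypothetical helper).

    Only the -o (report directory) and -I (restrict to matching paths)
    options mentioned in the article are used here.
    """
    cmd = [binary]
    if include_path is not None:
        cmd += ["-I", include_path]
    cmd += ["-o", output_dir, url]
    return cmd


full_scan = skipfish_command("http://www.example.com/", "example-log")
blog_scan = skipfish_command(
    "http://www.example.com/blog", "blog-report", include_path="/blog"
)

print(" ".join(full_scan))
print(" ".join(blog_scan))
```

The resulting list can be handed to subprocess.run(cmd, check=True) to start the scan; keeping the arguments in a list avoids shell quoting problems with unusual URLs.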
Skipfish produces an enormous number of requests which it processes simultaneously at an impressive speed. During the first run in a local Gigabit Ethernet segment against a sufficiently scaled system, it achieved 2,796 HTTP requests per second. This immediately pushed the load of one of the scanning system's CPU cores to 100 per cent. Although the Linux TCP/IP stack can handle this load, it needs to utilise all available means to do so, as soon becomes evident when looking at the data returned by netstat.
Once a scan has been started, there is no indication of how long Skipfish will actually take to complete it. In our test, we had to wait for more than four hours for the scanner to complete its check of the neighbouring IIS 7.5 with the ScrewTurn wiki. In the process, the scanner sent 40 million HTTP requests and transferred 68GB of data. One of the scanning system's CPU cores was working to full capacity almost all of the time, and Skipfish used 140MB of working memory when combing through the wiki's moderate 1,195 files in 43 directories. At the same time, the IIS used the full capacity of two Intel Xeon 2.5GHz cores as well as a further 100MB to 500MB of working memory, producing a log file of 1.5GB in its default configuration.
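To put these figures into perspective, a rough back-of-the-envelope calculation can be sketched in Python. It assumes decimal gigabytes and a duration of exactly four hours; as the scan took longer than that, the average request rate it yields is an upper bound.

```python
# Figures reported for the IIS 7.5 / ScrewTurn wiki scan.
requests_sent = 40_000_000
data_bytes = 68 * 10**9   # assumption: 68GB counted in decimal gigabytes
duration_s = 4 * 3600     # assumption: exactly four hours (a lower bound)
files_found = 1_195

avg_rate = requests_sent / duration_s            # requests per second
bytes_per_request = data_bytes / requests_sent   # traffic per request
requests_per_file = requests_sent / files_found  # requests per wiki file

print(round(avg_rate))           # 2778
print(round(bytes_per_request))  # 1700
print(round(requests_per_file))  # 33473
```

Under these assumptions the scan averaged at most about 2,778 requests per second, close to the 2,796 measured in the first run, at roughly 1.7KB of traffic per request and more than 33,000 requests for each file the wiki actually contains.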