Virus scan with Google and MSN
In addition to HTML pages and PDF files, Google now also indexes EXE files. For these executables, the search engine analyzes the PE (Portable Executable) header introduced with Windows NT and displays its details. As with PDF files, Google presents the hits as linked "HTML versions" when you search for "Signature: 00004550", the PE signature that identifies a file as a Windows NT-style executable. MSN also offers a PE analysis and presents the hits in a similar fashion.
In the past, operators of search engines have tried to pull as much data as possible from the net for later analysis. In an experiment, security firm Websense demonstrated how the data from executable files can be used: its researchers used Google to look for malware on web sites and found thousands of malicious files on the World Wide Web.
To do so, they wrote an application that uses Google's programming interface for binary search queries. They looked not only for the Windows NT signature, but also for signatures indicative of malicious software. The analysts found numerous Trojans, variants of the Bagle and Mytob worms, and various other malicious files in newsgroups, forums, personal home pages, servers at educational institutions, compromised web sites, and underground sites.
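The article does not say which signatures Websense searched for. As one plausible sketch, the Python code below derives a simple fingerprint from three standard PE header fields of a known sample (TimeDateStamp, AddressOfEntryPoint, and SizeOfImage) and formats it as a quoted search string next to the PE signature. The field offsets follow the PE format; the "Name: value" query layout, the choice of fields, and the names header_fingerprint and to_query are assumptions made for this illustration, not a documented search engine syntax.

```python
import struct


def header_fingerprint(data: bytes) -> dict:
    """Extract a few PE header fields that tend to differ between binaries."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe,) = struct.unpack_from("<I", data, 0x3C)
    # Need the signature (4), COFF header (20) and 60 optional-header bytes.
    if len(data) < pe + 24 + 60 or data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("no complete PE header")
    # TimeDateStamp sits 8 bytes after the start of the signature.
    (timestamp,) = struct.unpack_from("<I", data, pe + 8)
    opt = pe + 24  # start of the optional header
    (entry_point,) = struct.unpack_from("<I", data, opt + 16)
    (image_size,) = struct.unpack_from("<I", data, opt + 56)
    return {
        "TimeDateStamp": timestamp,
        "AddressOfEntryPoint": entry_point,
        "SizeOfImage": image_size,
    }


def to_query(fingerprint: dict) -> str:
    """Format the fingerprint as quoted phrases (assumed query layout)."""
    terms = ['"Signature: 00004550"']
    terms += [f'"{name}: {value:08X}"' for name, value in fingerprint.items()]
    return " ".join(terms)
```

A fingerprint built from so few fields can match unrelated files, so any hit would still have to be downloaded and checked with a real scanner before being counted as malware.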
Websense believes its findings confirm that web sites are increasingly being used to store and distribute malicious software. While security experts can use Google to find malicious software, the authors of such software can use the same function. For instance, they could embed advertising text as strings in their files to draw attention to their malicious software. After all, virus creators are increasingly trying to sell their products.
As Dan Hubbard, Senior Director of Security and Technology Research at Websense, told heise Security, "In our investigations, we found thousands of web sites that spread malicious software. We are now processing a number of different search queries every day, and so far we have found more than 2,500 malicious files."
He stated that Google's binary search was used in this experiment. "For years, we have been using text-based searches along with our own crawlers to study 80 million sites per day. Drive-by downloads are common. The number of sites that try to install malicious software via drive-by downloads is estimated in the hundreds or thousands."
Hubbard added that "we are only using these findings for our in-house research, mainly to add protective mechanisms to our products. Informing all of the operators of web sites is not easy, but we do notify major brand vendors."
H.D. Moore, known among other things as the initiator of the Metasploit project and the Month of Browser Bugs campaign, is experimenting in the same field. He has fed his malware search engine with the unique signatures of around 300 different malware samples and plans to increase the number of PE signatures to 6,000.
In his experiments, Moore found far fewer malicious files behind the links than Websense's figures would seem to indicate. For example, a sample search for Bagle variants turned up only 23 such files across the individual signatures.
- Mining for malcode with Google, entry at the Websense blog