The excitement builds up when I test-drive the first halfway useful version of my OfficeMalScanner on the PowerPoint file. Take a look at that – the tool works as planned and shows me a total of five streams behind the PowerPoint document:
Pictures [TYPE: Stream - OFFSET: 0x200 - LEN: 857541]
CurrentUser [TYPE: Stream - OFFSET: 0xe3600 - LEN: 47]
SummaryInformation [TYPE: Stream - OFFSET: 0xd7800 - LEN: 44484]
PowerPointDocument [TYPE: Stream - OFFSET: 0xd1800 - LEN: 46958]
DocumentSummaryInformation [TYPE: Stream - OFFSET: 0xd7800 - LEN: 912]
On the other hand, it all looks pretty harmless up to now. In particular, the scanner did not find a VB macro, even though I took pains to include a special detection routine that would have decompressed any Visual Basic script code into a separate file. But I'll find it next time.
The PowerPoint file obviously uses more refined tricks to execute code. It probably uses one of the numerous known and long-patched vulnerabilities in PowerPoint to inject and execute arbitrary code directly. But if so, I should be able to find the problem in the Office file if I know what to look for. So I sit down to look a bit deeper with a new scan routine.
A couple of hours later, the new version is finished. The new tool recognises hidden executable files in Office files by their PE header, the "MZ" and the rest that you see when you open an EXE file in an editor. It recognises an embedded OLE object by the typical signature of the Office binary format.
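The idea behind such a scan can be sketched in a few lines of Python. This is only an illustration, not the actual OfficeMalScanner code: the function names are made up, the PE check simply follows the `e_lfanew` field at offset 0x3C of the "MZ" header to the PE signature, and the OLE check looks for the well-known eight-byte compound-file magic anywhere past the start of the host file.

```python
import struct

# Signature at the start of every OLE2 compound file
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def find_all(data: bytes, needle: bytes):
    """Yield every offset at which needle occurs in data."""
    pos = data.find(needle)
    while pos != -1:
        yield pos
        pos = data.find(needle, pos + 1)

def find_embedded_pe(data: bytes):
    """Return offsets of 'MZ' headers whose e_lfanew points at a PE signature."""
    hits = []
    for off in find_all(data, b"MZ"):
        if off + 0x40 > len(data):
            continue  # not enough room for a DOS header
        (e_lfanew,) = struct.unpack_from("<I", data, off + 0x3C)
        pe_off = off + e_lfanew
        if data[pe_off:pe_off + 4] == b"PE\x00\x00":
            hits.append(off)
    return hits

def find_embedded_ole(data: bytes):
    """Return offsets of OLE2 signatures other than the host file's own at 0."""
    return [off for off in find_all(data, OLE_MAGIC) if off > 0]
```

Checking `e_lfanew` rather than "MZ" alone matters in practice, because the two bytes "MZ" turn up by chance in almost any large binary file.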
In addition, I have also put together a number of signatures to detect typical shellcode elements. For instance, the tool reports the combination of push and call instructions typical of function calls, and a number of more complex (and therefore more reliable) signatures are also already included. The one below, for example, is a classic in Windows shellcode programming and is still used in exactly this way in a number of exploits:
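mov eax, fs:[30h]     ; eax = pointer to the PEB
mov eax, [eax+0Ch]    ; eax = PEB loader data (PEB_LDR_DATA)
mov esi, [eax+1Ch]    ; esi = first entry of the initialisation-order module list
lodsd                 ; eax = next entry in that list
mov ebp, [eax+08h]    ; ebp = base address of that module (kernel32.dll)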
If you inject code, you generally do so in order to subsequently download files from the internet, write them onto the hard drive, and launch them. There are a number of functions in the system for this purpose, but you have to find them first in order to call them. Back in 2002, the legendary hacker group that called itself "Last Stage of Delirium" proposed this approach to reliably find the base address of the central system library kernel32.dll in memory. That library provides a number of useful system functions, such as LoadLibraryA to load further libraries.
The code itself is quite straightforward. At FS:0x30 in an active Windows process, there is always a pointer to what is called the Process Environment Block (PEB), which contains, among other things, linked lists of the modules already loaded. The code works through those lists until the desired base address of kernel32.dll ends up in the ebp register. The breakthrough paper entitled Understanding Windows Shellcode and a number of other sources describe the main machine code sequences you need to know in order to interpret shellcode.
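A signature for this access is easy to express at the byte level. The following Python sketch is an assumption about how such a check could look, not the scanner's real signature: it searches for the FS segment prefix (0x64) followed by either the short `mov eax, fs:[30h]` encoding or the general `mov reg, fs:[30h]` form with a 32-bit displacement. The function name `find_fs30` is hypothetical.

```python
def find_fs30(data: bytes):
    """Return offsets of x86 instructions that load a register from fs:[30h]."""
    hits = []
    disp = b"\x30\x00\x00\x00"  # 32-bit displacement 0x30, little-endian
    for off in range(len(data) - 5):
        if data[off] != 0x64:  # FS segment-override prefix
            continue
        # 64 A1 30 00 00 00  ->  mov eax, fs:[30h]
        if data[off + 1] == 0xA1 and data[off + 2:off + 6] == disp:
            hits.append(off)
        # 64 8B /r 30 00 00 00 with mod=00, rm=101  ->  mov reg, fs:[30h]
        elif (data[off + 1] == 0x8B
              and (data[off + 2] & 0xC7) == 0x05
              and data[off + 3:off + 7] == disp):
            hits.append(off)
    return hits
```

Shellcode that first zeroes a register and then reads `fs:[reg+30h]` uses a different encoding and would slip past this sketch, which is one reason why a scanner needs more than one such signature.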
But back to the PowerPoint file. I run the test with the new scan mode in my OfficeMalScanner and quickly have some interesting findings: an FS:[30h] access to find kernel32's base address, an API hashing loop to locate certain functions, and a slew of push/call combinations:
FS:[30h] (method 1) signature found at offset: 0x506e
API-Hashing signature found at offset: 0x52fb
PUSH DWORD/CALL signature found at offset: 0x50ab
PUSH DWORD/CALL signature found at offset: 0x5137
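For readers who have not met API hashing before: instead of carrying function names such as LoadLibraryA as strings, shellcode typically walks a library's export table, hashes each name, and compares the result with a precomputed constant. The Python sketch below shows the rotate-right-by-13 variant that is common in Windows shellcode. Whether this particular sample uses exactly that variant is an assumption, and the function names are chosen purely for illustration.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    value &= 0xFFFFFFFF
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def ror13_hash(name: str) -> int:
    """Hash an export name: rotate the running hash right by 13, then add each byte."""
    h = 0
    for ch in name.encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h

# Hashes an analyst might precompute and look for as constants in a disassembly
for api in ("LoadLibraryA", "GetProcAddress", "WinExec"):
    print(f"{api}: 0x{ror13_hash(api):08x}")
```

Because the loop consists of a rotate, an add, and a compare in a tight cycle, it leaves a recognisable instruction pattern that a scanner can match on.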
Indeed, a quick check with the disassembler DisView.exe, which I also threw together hastily, clearly reveals shellcode starting at 0x506e. But I'm still not really satisfied.