The second series of "CSI:Internet" was originally published in c't magazine starting with issue 15/2011. For links to articles in the first series please refer to our CSI:Internet HQ - Series 1 page.
Episode 4: Open heart surgery
by Frank Boldewin
It's one of those rare Saturday afternoons when the sun is actually shining. I'm just wondering how much meat to buy for the barbecue when my mobile rings. It's Hans – he nervously confesses to me that he thinks he's caught himself a virus.
As I innocently enquire how that might have happened, my normally supremely confident mechanical engineering student friend breaks into a stutter, "Well, the thing is, I, well yesterday I bought a new computer, with Microsoft Office pre-installed. But it was only a 30-day trial version, and I don't have the money for the full version. So I thought…" I finish his sentence for him, "I thought I'd just download a hacked version and save myself a bit of cash. And now strange things have started happening. Right?"
"Yeah, exactly, how did you know that?" asks Hans. "Because there's one born every minute", I think to myself. "Because you're not the first person to call me up with this kind of problem", I tell him.
I draw out from him the full chain of events. "After launching what was supposed to be a hacked Office version, as if by magic the executable vanished from my desktop. Other than that, nothing happened, or at least nothing I could see. But since then, my router has been signalling almost constant internet traffic, even though no applications which should be generating traffic are running. After restarting, everything looked OK for a little while, but then the router LEDs started blinking away again." Since I owe him one, I jump on my bike and head on over to Hans' place.
A quick look at the system and I already have a hunch that I'm not going to find anything superficial, and will need to drill down deeper. It's too bad that I've left my memory dump analysis system in the office. So I'm going to have to analyse the computer directly. Local kernel debugging – it's like open heart surgery.
The phrase "No risk, no fun!" runs through my brain while I roll my metaphorical sleeves up. Less metaphorically, I install Microsoft's Debugging Tools for Windows from my write-protected USB flash drive – principally for its excellent WinDbg debugger. It requires .NET, but that's already installed.
Normally the debugger would be run on a separate analysis system and would control the computer running the code we want to analyse via a serial cable or FireWire. In the absence of a second computer, I'm going to have to debug locally. But then what are tools like Mark Russinovich's LiveKd for? I sling it into the WinDbg installation directory and also copy Moonsols' useful callbacks.wdbg script for WinDbg into the scripts\ subdirectory.
We're ready to roll. I launch LiveKd with the argument -w, which calls WinDbg. LiveKd starts by asking me whether I want to download the current symbol files from the Microsoft Symbol Server at http://msdl.microsoft.com/download/symbols. You bet I do: this is what tells the debugger the addresses of all the Windows functions and data structures. Since these vary between Windows versions, languages, service packs and even individual updates, the right symbols for the particular system are always required.
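For readers who prefer to set the symbol source up front rather than answer LiveKd's prompt, the following is a minimal sketch of how a symbol path for the server named above is commonly composed. The srv*cache*URL form and the _NT_SYMBOL_PATH environment variable are standard for the Debugging Tools; the cache directory C:\Symbols and the helper names are assumptions made for illustration only.

```python
import os

# Microsoft's public symbol server, as named in the text.
SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir: str, server: str = SYMBOL_SERVER) -> str:
    """Build a symbol path of the form srv*<local cache>*<server URL>."""
    return f"srv*{cache_dir}*{server}"


def set_symbol_path(cache_dir: str) -> str:
    """Store the path in _NT_SYMBOL_PATH for tools started from this process."""
    path = symbol_path(cache_dir)
    os.environ["_NT_SYMBOL_PATH"] = path
    return path


if __name__ == "__main__":
    # C:\Symbols is a hypothetical cache directory chosen for illustration.
    print(set_symbol_path(r"C:\Symbols"))
```

The local cache matters on an analysis like this one: once the symbols for this exact build have been fetched, later debugger sessions on the same machine can reuse them.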
Firstly, I'd like to find out whether the malware has embedded itself within the system and, if it has, how deeply. This is where callbacks.wdbg can help me. Various events can cause the Windows kernel to activate callback handlers. Creating a new process, for example, triggers an event for which a driver can register a handler; the kernel keeps these handlers in a list named PspCreateProcessNotifyRoutine.
Pretty much every sophisticated kernel mode rootkit I've come across over the last few years uses one of these options to hook itself into process launches. This allows it to deactivate security software on loading, or to inject user mode components into trusted Windows processes such as svchost, winlogon and services.exe.
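The callback mechanism described above can be pictured with a toy model. This is plain Python, not kernel code: the class, the event name and both handlers are hypothetical, and the only point is that every routine registered for an event gets called when the event fires, whether it belongs to legitimate software or to a rootkit.

```python
from typing import Callable, Dict, List

# A handler receives the event name and a description of what happened.
Handler = Callable[[str, str], None]


class NotifyRegistry:
    """Toy stand-in for the kernel's per-event lists of notify routines."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def register(self, event: str, handler: Handler) -> None:
        """Append a handler to the list kept for the given event."""
        self._handlers.setdefault(event, []).append(handler)

    def fire(self, event: str, detail: str) -> int:
        """Call every handler registered for the event; return how many ran."""
        handlers = self._handlers.get(event, [])
        for handler in handlers:
            handler(event, detail)
        return len(handlers)

    def listing(self) -> Dict[str, List[str]]:
        """Name the handlers per event, roughly what a listing script reports."""
        return {ev: [h.__name__ for h in hs] for ev, hs in self._handlers.items()}


seen: List[str] = []


def security_monitor(event: str, detail: str) -> None:
    seen.append(f"monitor saw {event}: {detail}")


def rootkit_hook(event: str, detail: str) -> None:
    # A hooked-in routine sees the same launches as legitimate software does.
    seen.append(f"hook saw {event}: {detail}")


registry = NotifyRegistry()
registry.register("create_process", security_monitor)
registry.register("create_process", rootkit_hook)
registry.fire("create_process", "svchost.exe")
```

Calling registry.listing() returns the handler names per event, which is the kind of inventory I am after on Hans' machine: a list of who is listening, so that anything unexpected stands out.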
So I launch the callback script from the kernel debugger prompt. It runs through the events in sequence and lists all of the handlers registered for these events:
OK, so the KD syntax takes a little getting used to. Once you get the hang of it, though, it's the most powerful tool in the Windows ecosystem. I ignore the error messages for a few symbols which KD is unable to resolve and concentrate on the callbacks found. The callbacks for Create Thread and BugCheckCalls all point to known Windows functions in the HAL or the NDIS or SMB drivers. Nothing to see here. But what does pull me up short is an entry under load image notify callback: its address, 0x820d78eb, apparently points into completely uncharted territory.
This calls for a second look using the KD command !address 0x820d78eb. The debugger confirms my fears:
813ed000 - 01213000
The kernel's nonpaged pool contains data which must always remain in physical memory, i.e. is never swapped out to the hard drive. This is where the kernel stores data structures which still need to be available in situations where a page fault, and the read from the hard drive it would trigger, would prove fatal. It's not a good idea, for instance, to start additional read operations while handling a hardware interrupt from the hard drive controller.
The data stored here contains information on processes, semaphores, mutexes, etc. Memory from the nonpaged pool is almost never used for code. I have only ever seen code in this area when a rootkit has been involved, most recently with a piece of malware called TDL. Could I be looking at a TDL variant? If I am, then Hans has managed to infect his brand new system with a pretty advanced kernel rootkit.
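The range check behind the debugger's verdict is easy to reproduce by hand. The sketch below assumes that the two hexadecimal numbers in the debugger output are a base address and a size, which is an interpretation of the output rather than something the output states; the function names are made up for illustration. The arithmetic only shows that the callback address lies inside that region; the classification of the region as nonpaged pool comes from the debugger.

```python
from typing import Tuple


def parse_range(line: str) -> Tuple[int, int]:
    """Parse 'BASE - SIZE' (both hexadecimal) into (start, end), end exclusive."""
    base_text, size_text = (part.strip() for part in line.split("-"))
    base = int(base_text, 16)
    size = int(size_text, 16)
    return base, base + size


def contains(region: Tuple[int, int], address: int) -> bool:
    """Return True if the address lies inside the half-open region."""
    start, end = region
    return start <= address < end


# Values taken from the text; reading them as base and size is an assumption.
region = parse_range("813ed000 - 01213000")
callback = 0x820D78EB

print(f"region: {region[0]:#010x} .. {region[1]:#010x}")
print("callback inside region:", contains(region, callback))
```

Under that reading the region runs from 0x813ed000 up to, but not including, 0x82600000, and 0x820d78eb sits inside it.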
An increasing number of rootkits, including a number of TDL variants, use a special technique to get their code executed. Windows maintains a pool of what are known as system worker threads, launched by the system process during boot. These are intended to take work off the hands of other threads, such as threads for handling interrupts. This is done, for example, so that a section of code which locks important system resources while it executes can be left as quickly as possible, or simply to improve the stability of key kernel components.
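The idea of handing work to a pool of waiting threads can be sketched in user space. This is a toy model in Python, not a description of the Windows implementation: the class, its methods and the two work items are hypothetical. It shows only the pattern, in which the caller queues an item and returns at once while a worker thread runs it later.

```python
import queue
import threading
from typing import Callable, List

WorkItem = Callable[[], None]


class WorkerPool:
    """Toy model of a pool of worker threads fed from a shared work queue."""

    def __init__(self, workers: int) -> None:
        self._queue = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            try:
                item()  # the deferred work runs in the worker's context
            finally:
                self._queue.task_done()

    def queue_work_item(self, item: WorkItem) -> None:
        """Hand work to the pool and return to the caller immediately."""
        self._queue.put(item)

    def wait(self) -> None:
        """Block until every queued item has been processed."""
        self._queue.join()


results: List[str] = []
lock = threading.Lock()


def deferred(name: str) -> WorkItem:
    def work() -> None:
        with lock:
            results.append(f"{name} ran on {threading.current_thread().name}")
    return work


pool = WorkerPool(workers=2)
pool.queue_work_item(deferred("cleanup"))
pool.queue_work_item(deferred("logging"))
pool.wait()
```

What makes the pattern attractive to malware is visible even in the toy: the queued code runs on a thread that belongs to the pool, not to whoever queued it.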