Aww, another month or more has apparently passed right before my eyes. As some of you might have realized, the school year ended about two weeks ago, which leaves me time to do some more research and to remember about this blog. I expect to write a few more posts in the coming days; I hope this works out.

In this post, I would like to describe some curiosities I found inside the default Windows libraries kernel32.dll (and KernelBase.dll in the case of Windows 7 RC) and ntdll.dll. Not only do I want to share the ideas that occurred to me during this small piece of research, but I would also like to hear about new ways of making use of what I found. Feel free to add new facts or ideas regarding this post, as I may have overlooked something obvious. Remember that this is not, and shouldn’t be considered, a thorough report. To be clear, the entire post covers the x86 versions of Microsoft Windows.

Actually, I want to write about a few things, all of which are listed below:

  • Modifying the initial main thread CONTEXT structure via static DllMain code
  • LoadLibrary/FreeLibrary APIs handling the module reference counter
  • Undocumented FreeLibrary(AndExitThread) functionality and practical ways of making use of it
  • Fooling module management APIs by crafting the TEB structure

Although I would like to publish one post covering all the points listed above, I’ll create a short series of posts instead, each describing a separate mechanism. This should both help me manage the information and avoid overwhelming the reader.

DllMain function and the initial process context

One of the most interesting things I recently came across was a short post on Skywing’s blog about a fact that is neither documented nor well known: what the lpvReserved DllMain parameter really contains. It isn’t my intention to rewrite his post, so I’ll just summarise the main point (although I encourage you to read it first, if you haven’t done so yet). The author uncovers a very curious fact that has been documented very poorly by Microsoft, concerning the mysterious lpvReserved parameter of the standard DllMain module entry function:

BOOL WINAPI DllMain(
__in  HINSTANCE hinstDLL,
__in  DWORD fdwReason,
__in  LPVOID lpvReserved
);

Due to its type and name, one could assume the argument is not important to a programmer and is presumably set to NULL. However, the documentation gives a small sign that there’s something more about this value:

If fdwReason is DLL_PROCESS_ATTACH, lpvReserved is NULL for dynamic loads and non-NULL for static loads.

If fdwReason is DLL_PROCESS_DETACH, lpvReserved is NULL if FreeLibrary has been called or the DLL load failed and non-NULL if the process is terminating.

As the quotation above states, our library is able to tell a static load from a dynamic one and execute the appropriate code for each situation. That is basically all the information provided by Microsoft itself. As Skywing shows, however, more data is passed to our routine during process initialization: the LPVOID lpvReserved parameter can just as well be treated as PCONTEXT lpvContext! The CONTEXT structure being pointed to is the main thread’s processor context, which is set using the NtContinue system call after the PE loader finishes process initialization. The mechanism behind this has already been described, so I will only try to give some examples of how this little curiosity can be used in practice. On the other hand, one must remember that even though this ‘feature’ is confirmed to work on current Windows versions, it is not guaranteed to remain in newer systems.
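The documented half of this behaviour can be sketched in a few lines. The helper name classify_dllmain_call and its enum are hypothetical and exist only for this illustration; the reason codes are the standard winnt.h values and are redefined only when windows.h is unavailable. The Windows-only part shows the undocumented cast discussed above; it relies on Skywing’s observation rather than on anything Microsoft guarantees, and g_initial_eip is likewise a made-up name.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the classification logic also compiles off Windows. */
#define DLL_PROCESS_DETACH 0
#define DLL_PROCESS_ATTACH 1
#endif

/* Hypothetical names, introduced for this sketch only. */
enum dllmain_call {
    CALL_STATIC_LOAD,   /* attach, lpvReserved != NULL                 */
    CALL_DYNAMIC_LOAD,  /* attach, lpvReserved == NULL                 */
    CALL_PROCESS_EXIT,  /* detach, lpvReserved != NULL                 */
    CALL_FREE_LIBRARY,  /* detach, lpvReserved == NULL (or failed load) */
    CALL_OTHER          /* thread attach/detach notifications          */
};

/* Applies the rules quoted from the documentation above. */
enum dllmain_call classify_dllmain_call(unsigned long reason, const void *reserved)
{
    if (reason == DLL_PROCESS_ATTACH)
        return reserved != NULL ? CALL_STATIC_LOAD : CALL_DYNAMIC_LOAD;
    if (reason == DLL_PROCESS_DETACH)
        return reserved != NULL ? CALL_PROCESS_EXIT : CALL_FREE_LIBRARY;
    return CALL_OTHER;
}

#if defined(_WIN32) && (defined(_M_IX86) || defined(__i386__))
static DWORD g_initial_eip;  /* remembered so it can be restored or inspected later */

BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
    (void)hinstDLL;
    if (classify_dllmain_call(fdwReason, lpvReserved) == CALL_STATIC_LOAD) {
        /* Undocumented assumption: on a static load the pointer refers to
           the main thread's initial CONTEXT. */
        PCONTEXT ctx = (PCONTEXT)lpvReserved;
        g_initial_eip = ctx->Eip;
    }
    return TRUE;
}
#endif
```

Keeping the classification separate from DllMain makes it possible to check the logic without loading the library into a process.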

Let’s take a look at some of the possible scenarios:

  • Unpackers altering the EntryPoint field

If an exe-packer developer took advantage of the fact that the DllMain function has full control over the main thread’s initial context, this could be successfully used to create a special unpacking library. After compressing the main Portable Executable file, one of its import records would be altered so that it references unpack.dll. The goal of this “additional” module would be to dynamically allocate a small piece of memory, copy the loader’s code there and change lpvContext->Eip to point to the new code. After the loader stub generated by unpack.dll finished executing, every single memory page allocated by the unpacker’s code would be freed (including the unpack.dll image sections). The purpose of this approach is to keep the PE file as clean as possible (modifying the smallest possible number of header fields etc.). The same goes for the in-memory layout of the application, which is intended not to contain any additional memory blocks compared to the original process.

  • Choosing EntryPoint depending on the system version

Since the contents of the initial context clearly differ between versions of Microsoft Windows, this fact could be used as an OS-detection technique. A number of fields could be taken into account, such as lpvContext->Eip (pointing somewhere inside kernel32.dll, an additional execution layer before the EntryPoint) or the stack contents (pointed to by lpvContext->Esp). If an application had problems with a specific Windows version, or some operations had to be handled differently, the developer could simply create a separate entry function for each system and map the context values characteristic of a given OS to the appropriate EP routine.

  • Debug Registers modification

Playing with the Debug Register values could serve both as a software protection layer and as additional debugging functionality. Up to 4 hardware breakpoints can be set by means of the DRx registers. Control over these fields makes this debug functionality available from the very beginning of the application’s execution path. Apart from the practical advantages, it can also be used to fool a reverser trying to analyze our code: I would probably have a lot of trouble finding out how the breakpoints magically appear at the EntryPoint unless I thoroughly checked the imports’ DllMain code first.
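To make the DRx idea more concrete, here is a minimal sketch of how the DR7 control value for one breakpoint slot could be computed. The bit layout (local-enable bits at positions 0, 2, 4, 6 and a 4-bit R/W+LEN group per slot starting at bit 16) follows the Intel manuals; the function and macro names are hypothetical. Whether the loader actually applies debug registers from the initial context is an assumption that has not been verified here.

```c
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* R/W and LEN encodings of DR7 (names made up for this sketch). */
#define DR7_RW_EXECUTE 0u
#define DR7_RW_WRITE   1u
#define DR7_RW_ACCESS  3u   /* read or write */
#define DR7_LEN_1      0u
#define DR7_LEN_2      1u
#define DR7_LEN_4      3u

/* Returns dr7 with breakpoint `slot` (0..3) locally enabled and configured. */
uint32_t dr7_enable_local(uint32_t dr7, unsigned slot, unsigned rw, unsigned len)
{
    assert(slot < 4 && rw <= 3 && len <= 3);
    unsigned ctl_shift = 16 + slot * 4;                /* R/W+LEN nibble of this slot */
    dr7 &= ~((uint32_t)0xF << ctl_shift);              /* drop the old configuration  */
    dr7 |= (uint32_t)(rw | (len << 2)) << ctl_shift;   /* R/W in low bits, LEN above  */
    dr7 |= (uint32_t)1 << (slot * 2);                  /* L0..L3 local-enable bit     */
    return dr7;
}

#if defined(_WIN32) && (defined(_M_IX86) || defined(__i386__))
/* Arms an execute breakpoint in slot 0 of the given (initial) context. */
void arm_execute_breakpoint(PCONTEXT ctx, DWORD address)
{
    ctx->Dr0 = address;
    ctx->Dr7 = dr7_enable_local(ctx->Dr7, 0, DR7_RW_EXECUTE, DR7_LEN_1);
    ctx->ContextFlags |= CONTEXT_DEBUG_REGISTERS;
}
#endif
```

Execute breakpoints use a length of one byte; the other lengths only make sense for data breakpoints.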

  • Creating an EntryPoint detour

Another security-related idea is to set up a kind of “trampoline” code that would be executed just before the OEP and return to the original address right after doing its job (accomplished by simply changing lpvContext->Eip). The additional code could do whatever the author wanted, e.g. perform some additional executable integrity checks (checksums and the like). I find this technique very effective for a few reasons. Firstly, what I’m writing about here is poorly documented and unknown to most people. Secondly, the fact that any “external” code runs before the EP is almost invisible to those using debuggers like OllyDbg, which sets an initial breakpoint on the EntryPoint value taken from the PE header and waits for it to be hit. If the context modification routine were decently obfuscated, one could have real difficulty dealing with such a trick.

  • Setting the TF bit in the EFLAGS register with an SE handler

Yet another anti-debugging trick, this time making use of the EFLAGS bit mask, and particularly the Trap Flag, one of its members. This “trick” is old and well known in the community, though it can be quite a surprise when applied to the initial flags value. Before the first program instruction is executed, an exception of type EXCEPTION_SINGLE_STEP would be triggered and execution would be passed to the already-installed handlers. The absence of the exception would indicate the presence of an active debugger consuming it.
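The flag manipulation itself is tiny. The sketch below assumes only the architectural fact that the Trap Flag is bit 8 of EFLAGS; the helper names are hypothetical, and the Windows-only function shows where the change would be applied to the initial context.

```c
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#endif

#define EFLAGS_TF ((uint32_t)1 << 8)   /* Trap Flag: single-step after one instruction */

/* Returns the flags value with the Trap Flag set. */
uint32_t eflags_set_trap(uint32_t eflags)
{
    return eflags | EFLAGS_TF;
}

/* Non-zero when the Trap Flag is set in the given flags value. */
int eflags_trap_is_set(uint32_t eflags)
{
    return (eflags & EFLAGS_TF) != 0;
}

#if defined(_WIN32) && (defined(_M_IX86) || defined(__i386__))
/* Applied to the initial context, this requests a single-step exception
   as soon as the main thread starts running. */
void arm_single_step(PCONTEXT ctx)
{
    ctx->EFlags = eflags_set_trap(ctx->EFlags);
}
#endif
```

A handler that never sees EXCEPTION_SINGLE_STEP after this change can then conclude that something else consumed it.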

  • Zeroing specific segment registers

The next technique is one I consider pretty universal in terms of possible applications. Having one of the segment registers set to zero (NULL) could be used for a great variety of purposes, such as monitoring the application’s memory references, code obfuscation through exceptions and more. The advantages of controlling the segment registers in a protected mode environment are going to be covered in a separate post.

  • Passing information between two or more static modules

As Gynvael Coldwind suggested, the fact that all the libraries statically loaded into a new process operate, one after another, on the same piece of memory could be used to let the modules “communicate” with each other, although “passing information” sounds better to me. To do this, one could use the existing main thread stack. I am curious about possible realistic scenarios making use of this idea, and I am waiting for any interesting concepts : P

  • Running the application using ret-based programming

Last but not least, an idea that occurred to me just a few minutes ago: return-oriented programming! This subject has already been thoroughly documented by many independent researchers (a few presentations have also been given at more or less popular conferences). If you want to get familiar with the basics of the technique, you are strongly advised to read Generalizing Return-Oriented Programming to RISC and Return-Oriented Exploiting. In general, return-oriented programming aims to build fully functional code by chaining together assembly code snippets ending with a “ret” instruction. The idea itself is just a way of doing things; it is not a solution to one specific problem, but rather a tool that can be used for various purposes. However, code of this type is mainly seen in very advanced ret-to-libc exploits. In my case, I would like to use it as an obfuscation (anti-reversing) technique, additionally supported by the DllMain trick.

The very first task is to generate the initial stack data containing the pre-generated return-oriented code, i.e. the individual “opcode” components: pointers to executable memory together with their parameters. You can learn how to generate such code from the aforementioned papers. With this code in hand, the only remaining objective is to replace the lpvContext->Esp value with the address of the dynamically generated stack and make the lpvContext->Eip field point to a ret (0xc3) instruction. The last step could also be simplified by setting Eip to the value on top of the new stack and moving Esp past it (thus avoiding the need to find and execute the “ret” instruction). As soon as all the modifications are applied, the process should begin its execution right from the indirect code placed on the emulated stack. With this approach, the coder can be sure that the original EntryPoint will never be reached unless the return-oriented code decides to do so. This technique, used together with the DllMain hack, could also be used to develop something similar to a simple VM execution environment, with the bytecode (in the form of return-oriented code) placed on the stack. Even though performing dynamic analysis (debugging) on such code does not look like a hopeless task, I think that static analysis could be.
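The two ways of entering the chain described above can be compared with a small model. This is only a simulation of the register bookkeeping: entry_regs is a simplified stand-in for the Eip/Esp fields of CONTEXT, the gadget addresses in example_chain are made up, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up gadget addresses and operands, for illustration only. */
const uint32_t example_chain[] = {
    0x7C801111u,  /* address of a first hypothetical gadget */
    0x00000041u,  /* operand consumed by that gadget        */
    0x7C802222u   /* address of the next gadget             */
};

/* Simplified stand-in for the two CONTEXT fields that matter here. */
struct entry_regs {
    uint32_t eip;
    uint32_t esp;
};

/* Variant 1: Esp points at the chain, Eip at a lone `ret` (0xC3) byte. */
struct entry_regs enter_via_ret(uint32_t ret_address, uint32_t chain_base)
{
    struct entry_regs r = { ret_address, chain_base };
    return r;
}

/* Models what a single `ret` does: pop the dword at Esp into Eip. */
struct entry_regs simulate_ret(struct entry_regs r, const uint32_t *chain,
                               size_t chain_len, uint32_t chain_base)
{
    size_t index = (r.esp - chain_base) / sizeof(uint32_t);
    assert(r.esp >= chain_base && index < chain_len);
    r.eip = chain[index];
    r.esp += (uint32_t)sizeof(uint32_t);
    return r;
}

/* Variant 2: skip the `ret` by doing its work in advance. */
struct entry_regs enter_directly(const uint32_t *chain, size_t chain_len,
                                 uint32_t chain_base)
{
    return simulate_ret(enter_via_ret(0, chain_base), chain, chain_len, chain_base);
}
```

Both variants leave the thread in the same state once the first gadget starts executing, which is why the second one can replace the search for a “ret” byte.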

In general, the above list presents every single thing I was able to think of so far.

What is more, I managed to prepare a small bonus: a very simple example KeygenMe application showing how some of the presented ideas work on a real computer. One note: the overall protection scheme is much more important, and deserves more attention, than the application’s mechanics (CryptoAPI functions and so on). If you encounter any problems with the provided executable, please let me know (so far, it has been confirmed to work on Windows XP SP3 and Windows Vista SP2). The package can be downloaded from HERE.

Furthermore, I would be very pleased to get informed about any other suggestions regarding this subject (if you find any kind of mistake, appropriate feedback will be also appreciated).

This is actually all for now, I will do my best to carry on writing posts about other interesting Windows internal stuff I often come across ; )

As a loyal user of the standard Windows shell (explorer.exe), I often run into problems with the number of windows open on one desktop. Since my current notebook hardly ever goes down, neither does the user’s shell. After a few evenings of work, I often have difficulty locating the windows I need. With something like 40-50 of them, it is usually hard to switch between the internet browser, IDA, the programming IDE, virtual machines, the file manager and so on. The worst thing for me turned out to be looking for the TotalCommander window (the one I use most frequently). A situation like this was obviously wasting a lot of time and, consequently, causing frustration.

I came up with a few available solutions, listed below:

  1. Having the taskbar items sorted at any time, thus making the current work state much clearer.
  2. Creating a set of system-wide hotkeys, each responsible for setting focus on the associated window or a group of windows.
  3. Starting to use some kind of Virtual Desktop software and reorganizing the whole work environment.

All of them sound pretty good, in fact, and each is worth being described in detail. What is more, there is a great amount of free software designed just to help users with such problems. However, what everyone should already know is that the best solution is the made-by-myself one 😉

Although all of the ideas have their advantages and disadvantages, as well as different difficulty levels, that’s not the subject of this post. This time, I would like to focus on one particular approach, the 2nd one on the list, but in a slightly less complicated form. To be exact, I will show how to make ONE specific application perform some actions in response to a hotkey signal. As you may have guessed, this application is Total Commander in my case. The hotkey we will use as the totalcmd-caller will be ALT+1.

What I eventually wanted to achieve using this slight hack was to:

  • Have an ALT+1 hotkey registered in the system.
  • Handle the incoming hotkey events in a loop and perform specific actions (set the focus on totalcmd’s window) in some cases.

Since I wanted to be able to use the key combination at any time, the message-loop would also have to be active all the time. To be honest, it’s not a good option for me to have one additional process running on my system, only to have one simple window event handled the right way. As we need only one thread to keep the event handling loop active, we can easily put it inside the affected process itself (totalcmd.exe). Thus, before beginning the real work, we have to ensure we’ve got an active thread running inside our target. This can be accomplished in a few ways (what a surprise!):

  • Spoof one of the program’s imported DLL files, redirect all of its exports to original functions and start a new thread in the DllMain routine.
  • Do the same thing at runtime: use CreateRemoteThread function to inject our own DLL module into the target’s address space and perform some actions inside DllMain.
  • Patch the application executable directly so that it creates the new event-loop thread and deals with the hotkeys. In this case no additional, external files are needed.

As having the code of an external dll module executed in the process context gives the greatest control over the execution flow, I chose the first option. What makes it different from the second one is that we would need an additional ‘loader’ program to inject the dll into totalcmd, while the dll-spoofing technique takes advantage of the fact that the attacker’s DLL gets loaded automatically by the Windows loader.

The next step is to choose the library to spoof. What should be noted is that only non-system DLLs can be spoofed. The list of modules considered “system” can be found in the registry:

HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs

After comparing totalcmd imports with the list above, we can extract the library names we are able to spoof:

  • comctl32.dll
  • mpr.dll
  • winmm.dll
  • winspool.dll
  • version.dll (Vista only).
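The comparison that produced the list above can be sketched as a simple case-insensitive filter. The two sample arrays are made-up excerpts used only to exercise the code; on a real system the known list would be read from the KnownDLLs registry key and the import list from the executable’s import table. All function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Made-up sample data, not a dump from a real machine. */
const char *const sample_known[]   = { "kernel32.dll", "user32.dll", "gdi32.dll" };
const char *const sample_imports[] = { "KERNEL32.dll", "winmm.dll", "USER32.dll", "version.dll" };

/* DLL names are compared without regard to case. */
int names_equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Non-zero when `name` appears on the known (system) DLL list. */
int is_known_dll(const char *name, const char *const *known, size_t known_count)
{
    for (size_t i = 0; i < known_count; ++i)
        if (names_equal_nocase(name, known[i]))
            return 1;
    return 0;
}

/* Copies the imports that are NOT known DLLs into `out` (which must have
   room for import_count entries) and returns how many were copied. */
size_t spoofable_imports(const char *const *imports, size_t import_count,
                         const char *const *known, size_t known_count,
                         const char **out)
{
    size_t n = 0;
    for (size_t i = 0; i < import_count; ++i)
        if (!is_known_dll(imports[i], known, known_count))
            out[n++] = imports[i];
    return n;
}
```

With the sample data, only winmm.dll and version.dll survive the filter.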

The “Vista-only” note means that the version.dll library was one of the system DLLs up to Windows XP, but was removed from the KnownDLLs list in Windows Vista. As my intention was to make the hack work on Vista, and I wasn’t aware of the difference from Windows XP at the time, I chose the VERSION module as the spoofing target.

If we want to spoof a system dll with one of our own, the new library must have identical exports (the names as well as the ordinal numbers). As I used the MinGW package to compile this project, all the DLL-building details below are specific to dllwrap. First of all, I created the version.def file containing the exports together with their forwarding targets. We are not interested in injecting our code into any of the exports, only in having the DllMain function called, so all the exported functions are simple forwarders to the original ones. The following listing presents the final version.def file:

EXPORTS
GetFileVersionInfoA= myVersion.GetFileVersionInfoA     @1
GetFileVersionInfoExW= myVersion.GetFileVersionInfoExW     @2
GetFileVersionInfoSizeA= myVersion.GetFileVersionInfoSizeA     @3
GetFileVersionInfoSizeExW= myVersion.GetFileVersionInfoSizeExW     @4
GetFileVersionInfoSizeW= myVersion.GetFileVersionInfoSizeW     @5
GetFileVersionInfoW= myVersion.GetFileVersionInfoW     @6
VerFindFileA= myVersion.VerFindFileA     @7
VerFindFileW= myVersion.VerFindFileW     @8
VerInstallFileA= myVersion.VerInstallFileA     @9
VerInstallFileW= myVersion.VerInstallFileW    @10
VerLanguageNameA= myVersion.VerLanguageNameA    @11
VerLanguageNameW= myVersion.VerLanguageNameW    @12
VerQueryValueA= myVersion.VerQueryValueA    @13
VerQueryValueW= myVersion.VerQueryValueW    @14

As you can see, there are only 14 exported addresses, all of them pointing to their equivalents inside the original library, myVersion.dll. You can read more about DLL export forwarding in [1]. The last thing we need to build our fake version.dll file is the hotkey-handling code itself. Let’s begin with the DllMain part:

BOOL WINAPI DllMain(
  HANDLE hinstDLL,
  DWORD dwReason,
  LPVOID lpvReserved
)
{
  if(dwReason==DLL_PROCESS_ATTACH)
  {
    CreateThread(NULL,0,(LPTHREAD_START_ROUTINE)MessageOnlyWindow,NULL,0,NULL);
  }
  return TRUE;
}

Nothing really interesting, just creating a thread that begins in the MessageOnlyWindow function. Note that the CreateThread function is called only once, right after the application is launched (since the DLL_PROCESS_ATTACH notification is passed to every module right before the program’s EntryPoint is called). Let’s go a step further:

DWORD WINAPI MessageOnlyWindow(LPVOID arg)
{
  MSG msg;
  WNDCLASS wndclass;

  wndclass.style = 0;
  wndclass.lpfnWndProc = MainWndProc;
  wndclass.cbClsExtra = 0;
  wndclass.cbWndExtra = 0;
  wndclass.hInstance = GetModuleHandle(0);
  wndclass.hIcon = NULL;
  wndclass.hCursor = 0;
  wndclass.hbrBackground = 0;
  wndclass.lpszMenuName = NULL;
  wndclass.lpszClassName = TEXT("TotalcmdBringToTop"); 

  if(RegisterClass(&wndclass) == 0)
    return FALSE; 

  if(CreateWindow(TEXT("TotalcmdBringToTop"), TEXT("TotalcmdBringToTop"), 0,CW_USEDEFAULT, CW_USEDEFAULT,
                  CW_USEDEFAULT, CW_USEDEFAULT, HWND_MESSAGE, NULL, GetModuleHandle(0), NULL) == NULL)
    return FALSE;

  /* GetMessage returns -1 on error, so only positive values keep the loop going */
  while(GetMessage(&msg, NULL, 0, 0) > 0)
  {
    TranslateMessage(&msg);
    DispatchMessage(&msg);
  }
  return (DWORD)msg.wParam;
}

It actually looks like a standard function that creates a window (do we want to create any window? ^_*) and runs an event-handling loop. The only interesting thing here is the HWND_MESSAGE constant passed as the parent argument of CreateWindow. It indicates that we want to create a Message-Only window.

Such windows are usually created to handle events that are not related to any particular visible window (read more in [2]). In this case, I used it to deal with the WM_HOTKEY events generated every time a keyboard hotkey is used. Everything should become clear after seeing the last part of the library code:

LRESULT CALLBACK MainWndProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam)
{
  switch (uMsg)
  {
    case WM_CREATE:
      RegisterHotKey(hWnd,0x1337,MOD_ALT,VkKeyScan('1'));
      break;

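    case WM_HOTKEY:
      {
        HWND hMainWnd = FindWindow(TEXT("TTOTAL_CMD"),NULL);
        if(hMainWnd != NULL)  /* FindWindow returns NULL when no such window exists */
        {
          SetForegroundWindow(hMainWnd);
          ShowWindow(hMainWnd,SW_SHOWMAXIMIZED);
        }
      }
      break;

    case WM_CLOSE:
      UnregisterHotKey(hWnd,0x1337);
      /* intentional fall-through: DefWindowProc finishes closing the window */

    default:
      return (DefWindowProc(hWnd, uMsg, wParam, lParam));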
  }
  return 0;
}

What can be seen here is a hotkey being registered and given the 0x1337 identifier during the window initialization (RegisterHotKey @ MSDN). The new hotkey is associated with the current window through the hWnd handle passed as the first argument. From now on, we’re guaranteed to receive a WM_HOTKEY message whenever the defined (MOD_ALT+1) combination is pressed. When the callback function receives such an event, the TotalCmd window handle is obtained (TTOTAL_CMD is the application window’s class) and then used to maximize the main window and set focus on it.

What should be noted is that the FindWindow method is quite unreliable, since it will find only one window handle, which is a problem when there are many Total Commander instances running on one machine. We’ve got a design problem now: which of the windows should be chosen to focus on? The code can easily be extended to perform more advanced actions; this one is just a concept of how an application’s behaviour can be customized to fit our needs. When something goes wrong and a WM_CLOSE event occurs (for example, when TotalCmd itself decides to exit), we use the 0x1337 ID to unregister our hotkey from the system.
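One possible starting point for the multiple-instances problem is to collect every matching window first and only then decide which one to focus. The sketch below is not part of the original package: window_list, collect_if_match and find_all_totalcmd are hypothetical names, and the limit of 16 windows is arbitrary. The Windows-only part uses the documented EnumWindows and GetClassNameA calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

#define MAX_MATCHES 16   /* arbitrary limit for this sketch */

struct window_list {
    void  *handles[MAX_MATCHES];
    size_t count;
};

/* Stores `handle` when its class name equals `wanted`; returns 1 if stored. */
int collect_if_match(struct window_list *list, void *handle,
                     const char *class_name, const char *wanted)
{
    assert(list != NULL && class_name != NULL && wanted != NULL);
    if (strcmp(class_name, wanted) != 0 || list->count >= MAX_MATCHES)
        return 0;
    list->handles[list->count++] = handle;
    return 1;
}

#ifdef _WIN32
/* EnumWindows callback: called once for every top-level window. */
static BOOL CALLBACK enum_proc(HWND hwnd, LPARAM lparam)
{
    char cls[64];
    if (GetClassNameA(hwnd, cls, (int)sizeof cls) > 0)
        collect_if_match((struct window_list *)lparam, (void *)hwnd, cls, "TTOTAL_CMD");
    return TRUE;   /* keep enumerating */
}

/* Fills `list` with all Total Commander main windows and returns the count. */
size_t find_all_totalcmd(struct window_list *list)
{
    list->count = 0;
    EnumWindows(enum_proc, (LPARAM)list);
    return list->count;
}
#endif
```

The WM_HOTKEY handler could then apply whatever policy suits the user, for example cycling through the collected handles on consecutive key presses.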

After putting everything into one .cpp file, we can eventually compile the hack:

19:46:11 Vexillium> g++ dllmain.cpp -o dllmain.o -c
19:46:35 Vexillium> dllwrap --def version.def -o version.dll dllmain.o --driver-name g++
19:46:40 Vexillium> strip version.dll

The last thing to do is to copy the fake version.dll file to the \totalcmd directory and do the same with the original VERSION module from \Windows\System32 (renaming it to myVersion.dll in the process). Once both are copied, we can launch the totalcmd.exe executable and use the ALT+1 hotkey every time we want to get back to the totalcmd window.

The original code package can be downloaded from here.
Have a nice evening!

PS. The previous post has been updated – as I promised, a Proof of Code package can be downloaded now (link).

References & Links

  1. Exported functions that are really forwarders
  2. http://msdn.microsoft.com/en-us/library/ms632599(VS.85).aspx#message_only
  3. Dll Spoofing in Windows
  4. DLL forwarding is not the same as delay-loading
  5. An In-Depth Look into the Win32 Portable Executable File Format, Part 2
  6. RegisterHotKey Function


1. Introduction

The first technical post here is about the process of terminating applications on Windows. I have been researching this subject for the last few days, during which a number of interesting (and little-known) facts have come up. Some ideas for solving particular problems are presented here, though I am sure there are many other nice ways of dealing with them. Feel free to post your ideas below ;>

Note: the post will be supplied with the Proof of Concept code in a few days, to present the real-life usage of the described techniques.

The PoC package link is available on the bottom of the post.

2. Background

OK then, let’s start with some basics. Process termination is such an ordinary part of everyday life that hardly anyone wonders what is actually happening inside the system. Programs launch, programs exit, and it seems that nothing interesting happens. After a quick investigation, it turns out that it is not as simple as it appears.

We can divide the termination into four groups – 2 factors are considered here:

  • Is the application being closed GUI or console based?
  • Is the application being closed from inside or by an external process?

Let’s begin with the first division. It shouldn’t be a surprise to anyone that the way a program interacts with the user affects how it can be told that it is supposed to exit. In the first case (graphical interface), a great number of process resources are used (dialog boxes, strings, images, sounds etc.).

The process is able to control the usage of these resources: it has its own event dispatching loop and is provided with a really wide choice of APIs, and thus has full control over how its windows look, behave and so on. Messages related to application termination are available in WINAPI and widely used. In the console case, the program doesn’t have real control over the console object and can only customize its look and behavior using the system API.

To be more precise, all the process-specific termination machinery is common to both console and GUI programs. There are, though, some subsystem-specific mechanisms that concern only one type of application (like window events etc.). I will come back to this subject in a moment; let’s take a look at the second division now. There is a tremendous difference between an application closing itself and being closed by an external process. Even though Microsoft has developed proper functions for both actions, there are some important differences in what they are designed to be used for.

What should be noted here is that, in most cases, one process being killed by another indicates some kind of error. Under normal circumstances, every application on a local machine should be responsible for ending its execution at the right time (after finishing its work or being given a signal by the user). The program should never be surprised by the user wanting to close it: handling WM_CLOSE and similar events, as well as any other termination signals, should be done by the program itself. It is actually the only way to ensure everything is cleaned up before the execution ends. Not only does the process itself have to do the cleaning; it also has a number of external modules loaded at runtime, and these modules want to call their uninitialization routines as well.

3. Notifications on process termination

The standard WINAPI function for terminating the current process is ExitProcess. The best choice would be to call it every time we want to exit (returning directly to kernel32.dll is also fine most of the time). We are guaranteed to have the modules’ entry points called with the DLL_PROCESS_DETACH dwReason value, so we receive a legitimate termination notification and can save the current process state etc. Everything seems fine so far: ExitProcess always takes care of any callback functions and the like (which doesn’t mean everything is fine with the process, see [1]).

OK then, what about a situation where the user has no clue how to terminate a process (no GUI/console windows available on the desktop), but wants to get rid of it anyway (which is a very common situation, in fact)? Here comes the TerminateProcess function, which is apparently far more brutal than the previous one. Its definition is as follows (from MSDN):

BOOL WINAPI TerminateProcess(
     HANDLE hProcess,
     UINT uExitCode
);

As one can see, it takes an already opened HANDLE to the target process as one of its arguments. As MSDN states, the TerminateProcess function should be used to unconditionally cause a process to exit. This means that no notifications about its state are passed to the process: all of its threads are terminated immediately and any pending I/O operations are requested to cancel. This is what actually makes this method brutal: the target has no chance to know it is being closed, which makes it impossible for it to recreate the process or react in any other way to what’s happening.

As usual, theory is theory and life is life: there are some types of software that would be particularly interested in the advantages of an unkillable process. Yep, this could also be malware. Some ideas of how to get some kind of termination notification have been observed in the wild and described by AV companies. Here follows the list of techniques, both found on the net and discovered by myself:

  • Creating a system-wide hook to get notified – Setting up a global hook (for all the processes we can access) is nothing more than mapping our DLL library into the address space of the hooked processes. This subject is covered in more detail in [2]. The only thing the attacker should be aware of is that if we set up such a global hook, the system has to notify us when the main hooking process is terminated: it would be a pity if the process got killed but its hook were still active in the context of other applications. In such a situation, Windows calls the standard DllMain function in the context of every process we hooked (using the standard DLL_PROCESS_DETACH argument). This idea is claimed to be used by some real malware as a way of bypassing what TerminateProcess is supposed to provide: immediate termination without any notification that could help the process prevent the action. Since setting up a system-wide hook requires the process to run with special privileges that a normal user doesn’t usually possess, the trick is restricted to situations where we have the appropriate rights, i.e. the machine must already have been compromised.
  • Creating a system-wide hook to modify the TerminateProcess function – If we’re able to set up such a hook, and consequently inject our library into every possible process, we could use this library to find each process’s TerminateProcess API address and modify the function so that it is unable to close our protected process. This technique is rather unreliable, as there could still be some processes that we couldn’t hook and that would therefore be able to kill our program (e.g. processes owned by a more privileged user).
  • Installing a hook on the ZwTerminateProcess kernel-mode function – This one is pretty much the same idea as the one described above. The difference is that the modification concerns ring0 code, which makes it the most reliable technique I know. See: if any process wanted to kill ours, then no matter what API functions and tricks it used, it would eventually end up in the ZwTerminateProcess system call. There isn’t much code to write, either: we just want to filter out the calls referencing our executable, which should not be a big problem. This technique has already been described in Process Invincibility [3].
  • Using standard debugger notifications – If we only want to get informed about our process being killed (in order to recreate it etc), we can use a parent-child debugging scheme. An example of how it would work in practise follows:
  1. Process debugger.exe is launched at some point.
  2. Debugger.exe launches child.exe using a DEBUG_PROCESS flag, indicating the new process is going to be debugged by its parent
  3. Debugger.exe passes the execution to its child, waiting for a debugging event
  4. The user decides to kill child.exe process and uses the TerminateProcess function
  5. Child.exe gets closed (or at least suspended), and an EXIT_PROCESS_DEBUG_EVENT signal is passed to Debugger.exe.
  6. Debugger.exe retrieves the child process state and lets the system complete the termination.
  7. Debugger.exe re-creates child.exe and writes the process state back.
    As you can see, the whole trick takes advantage of the fact that the parent (debugger) process gets notified of what’s going on with its child. However, the problem is that another process gets involved in the whole action: debugger.exe is still a normal process and can be killed like any other. Someone could think of creating a self-debugging process that could (maybe) get informed about its own termination. This is a rather naive approach, and the following code listing taken from the WRK (\ntos\dbgk\dbgkobj.c) should dispel any doubts:
    //
    // Don't let us debug ourselves or the system process.
    //
    if (Process == PsGetCurrentProcess () || Process == PsInitialSystemProcess) {
        ObDereferenceObject (Process);
        return STATUS_ACCESS_DENIED;
    }

    We don’t have to create a special debugger process to keep being notified, though. We can still use some accessible system process that is present on the machine 99% of the time. In my opinion, the best idea would be to use explorer.exe, since it is the standard user shell process, running in the context of the current user. The idea is to inject a special thread into explorer.exe that would play debugger.exe’s role. However, since we’re able to create a remote thread in the context of explorer.exe, some other process could remove it as well. The method is not perfect, but it makes the whole termination work much harder for an average Windows user (although re-creating the explorer process would also get rid of the thread).
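The parent-child debugging scheme from the numbered steps above can be sketched roughly as follows. This is only a minimal illustration, not the Proof of Concept code: the function name supervise_child and its max_restarts parameter are made up for the example, the child is simply relaunched from scratch (saving and restoring its state, steps 6 and 7, is left out), and every breakpoint exception is continued for simplicity. On non-Windows builds the function is a stub that returns -1.

```c
#ifdef _WIN32
#include <windows.h>
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: runs `cmdline` as a debuggee and relaunches it each
 * time it exits, up to `max_restarts` times.
 * Returns the number of launches performed, or -1 on error. */
int supervise_child(const char *cmdline, int max_restarts)
{
#ifdef _WIN32
    int launches = 0;

    if (cmdline == NULL || strlen(cmdline) >= MAX_PATH)
        return -1;

    while (launches <= max_restarts) {
        char buf[MAX_PATH];
        STARTUPINFOA si;
        PROCESS_INFORMATION pi;
        DEBUG_EVENT ev;
        int exited = 0;

        strcpy(buf, cmdline);               /* CreateProcess takes a non-const buffer */
        ZeroMemory(&si, sizeof si);
        si.cb = sizeof si;

        /* Step 2: start the child as our debuggee. */
        if (!CreateProcessA(NULL, buf, NULL, NULL, FALSE, DEBUG_PROCESS,
                            NULL, NULL, &si, &pi))
            return -1;

        /* Steps 3-5: pump debug events until the child itself exits. */
        while (!exited && WaitForDebugEvent(&ev, INFINITE)) {
            DWORD status = DBG_CONTINUE;

            switch (ev.dwDebugEventCode) {
            case CREATE_PROCESS_DEBUG_EVENT:
                if (ev.u.CreateProcessInfo.hFile)
                    CloseHandle(ev.u.CreateProcessInfo.hFile);
                break;
            case LOAD_DLL_DEBUG_EVENT:
                if (ev.u.LoadDll.hFile)
                    CloseHandle(ev.u.LoadDll.hFile);
                break;
            case EXCEPTION_DEBUG_EVENT:
                /* Hand everything except breakpoints back to the child. */
                if (ev.u.Exception.ExceptionRecord.ExceptionCode != EXCEPTION_BREAKPOINT)
                    status = DBG_EXCEPTION_NOT_HANDLED;
                break;
            case EXIT_PROCESS_DEBUG_EVENT:
                /* DEBUG_PROCESS also reports grandchildren, so compare PIDs. */
                if (ev.dwProcessId == pi.dwProcessId)
                    exited = 1;
                break;
            }
            ContinueDebugEvent(ev.dwProcessId, ev.dwThreadId, status);
        }

        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        if (!exited)
            return -1;                      /* WaitForDebugEvent failed */
        launches++;                         /* step 7: the loop re-creates the child */
    }
    return launches;
#else
    (void)cmdline;
    (void)max_restarts;
    return -1;                              /* Win32-only technique */
#endif
}
```

Calling supervise_child("child.exe", 3) from debugger.exe would keep child.exe alive through three terminations. Note that the sketch does not distinguish a voluntary exit from a TerminateProcess call; the exit code in ev.u.ExitProcess.dwExitCode could be inspected for that.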

As can be seen, most of the ideas require some particular process privileges. The perfect situation would occur if our process was able to get notified about being terminated, without any additional memory hacks/processes involved. New ideas are welcome!

4. Closing applications the nice way

OK then, having a few ways of bypassing the TerminateProcess functionality, let’s take a look at the problem from the opposite side: we want to close some application from the outside in a nice way, so that it can perform its standard uninitialization actions. As far as I know, there is no documented API that would be appropriate in such a situation. There are, though, some tricks that make it possible to cleanly close an application in most cases (check [4]). However, these tricks are subsystem-dependent and thus have to be described separately. Let’s begin with console applications.

4.1 Console Ctrl+C event

As has already been written, a text-mode application doesn’t have full control over the console window’s look and event handling. However, there are some actions that make CSRSS (the process responsible for handling low-level console events) send specific signals back to the ‘client’ application. To be more precise, there are a few callback functions registered by kernel32.dll during console allocation/attachment that are called by WinSrv (one of the CSRSS components) in the context of the particular process (this subject is going to be described in detail in one of the upcoming ‘CSRSS internals’ posts). A Win32 programmer can take advantage of the fact that the CTRL+C event is one of those signalled to the process itself. Microsoft has developed and documented a

BOOL WINAPI SetConsoleCtrlHandler(
    PHANDLER_ROUTINE HandlerRoutine,
    BOOL Add
);

function [5], responsible for registering/removing a Ctrl+C callback function. The entire mechanism is pretty easy from a programmer’s point of view: the HandlerRoutine is called by kernel32.dll every time a Ctrl+C or Ctrl+Break combination is used (it is also called when the console window is closed manually). If our program needs to perform some additional cleanup apart from what ExitProcess performs, we can install such a callback and do what’s needed when someone tries to close our console window. There is one default kernel32 handler installed at the beginning, which does nothing more than call the ExitProcess function with the STATUS_CONTROL_C_EXIT constant as the exit code.
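A minimal sketch of such a callback might look like the one below. The names do_cleanup, on_console_ctrl, install_cleanup_handler and the g_cleanup_runs counter are made up for the example, and the "cleanup" is just a flush of buffered output. The handler returns FALSE on purpose, so the event is passed on to the next handler and eventually reaches the default one that calls ExitProcess. On non-Windows builds the installer is a stub returning 0.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Counts how many times the (hypothetical) cleanup routine has run. */
static volatile long g_cleanup_runs = 0;

/* Hypothetical cleanup: flush all buffered output streams. */
void do_cleanup(void)
{
    fflush(NULL);
    g_cleanup_runs++;
}

#ifdef _WIN32
/* Matches the PHANDLER_ROUTINE type expected by SetConsoleCtrlHandler.
 * The system runs it on a separate thread inside our process. */
static BOOL WINAPI on_console_ctrl(DWORD type)
{
    switch (type) {
    case CTRL_C_EVENT:
    case CTRL_BREAK_EVENT:
    case CTRL_CLOSE_EVENT:
        do_cleanup();
        break;
    default:
        break;
    }
    /* FALSE = "not handled": the next handler in the list runs, and the
     * default one finally calls ExitProcess. */
    return FALSE;
}
#endif

/* Registers the handler; returns 1 on success, 0 otherwise. */
int install_cleanup_handler(void)
{
#ifdef _WIN32
    return SetConsoleCtrlHandler(on_console_ctrl, TRUE) ? 1 : 0;
#else
    return 0;                               /* Win32-only API */
#endif
}
```

A program would call install_cleanup_handler() once at startup. Returning TRUE from the handler instead would swallow the event and keep the process running, which is exactly the kind of "console hack" that prevents a program from being closed with Ctrl+C.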

If we want to terminate a console application in a nice way, we can generate a Ctrl+C event in the context of the application – it would result in its immediate termination in 99% of cases. This can be achieved using the GenerateConsoleCtrlEvent function, whose description I encourage you to read. Despite the presence of the default kernel32 handler, we cannot be 100% sure the process will terminate after receiving a Ctrl+C signal – some console hacks can be used inside the program to prevent it from being closed by keyboard combinations and related techniques. I think a reasonable solution is to generate a Ctrl+C event, wait a few seconds to see whether the target closes itself, and take harsher actions in case the process is still alive.
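The "ask nicely, wait, then get harsh" approach could be sketched like this. A few assumptions are baked in: the function name close_console_process is made up, the target shares our console and was started with CREATE_NEW_PROCESS_GROUP (so its process ID doubles as the group ID), and the handle carries SYNCHRONIZE and PROCESS_TERMINATE access. The sketch also sends CTRL_BREAK_EVENT rather than CTRL_C_EVENT, because as far as the GenerateConsoleCtrlEvent documentation goes, Ctrl+C cannot be limited to a single process group. On non-Windows builds the function only validates its arguments and returns -1.

```c
#ifdef _WIN32
#include <windows.h>
#endif
#include <stddef.h>

/* Hypothetical helper.
 * Returns  1 if the target exited on its own after the console signal,
 *          0 if it had to be killed with TerminateProcess,
 *         -1 on error (or on non-Windows builds). */
int close_console_process(void *process, unsigned long group_id,
                          unsigned long timeout_ms)
{
    /* Group 0 would signal every process attached to our console,
     * including ourselves, so refuse it. */
    if (process == NULL || group_id == 0)
        return -1;

#ifdef _WIN32
    /* Nice way first: deliver the signal and give the target some time. */
    if (GenerateConsoleCtrlEvent(CTRL_BREAK_EVENT, (DWORD)group_id) &&
        WaitForSingleObject((HANDLE)process, (DWORD)timeout_ms) == WAIT_OBJECT_0)
        return 1;

    /* Still alive (or the signal could not be sent): harsher action. */
    return TerminateProcess((HANDLE)process, 1) ? 0 : -1;
#else
    (void)timeout_ms;
    return -1;                              /* Win32-only technique */
#endif
}
```

With a PROCESS_INFORMATION structure pi filled by CreateProcess, a call would look like close_console_process(pi.hProcess, pi.dwProcessId, 5000).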

4.2 CreateRemoteThread(ExitProcess)

When it comes to GUI applications, we have no chance to use the CSRSS console features unless the program is both graphical and text-mode (which isn’t a very common situation, anyway). One way or another, we can always try to create a remote thread that would do the job in the context of our target. CreateRemoteThread takes a function address as one of its parameters – the new thread’s EntryPoint. The given function is supposed to be of the following type:

DWORD WINAPI ThreadProc(
    LPVOID lpParameter
);

It means that knowing the address of an API function in the target’s address space, and provided the function takes exactly one argument, we can make the new thread start directly inside a system function. What must be remembered is that the CreateRemoteThread function creates a thread using the given address argument – the address MUST point to a valid memory area (simply a valid function) in the target’s address space. If we wanted to launch our own function in the context of another process, we would first have to allocate some executable memory, copy the desired code inside, and eventually use the new allocation address as the thread API argument.

What makes the whole trick so easy is that the ExitProcess definition matches ThreadProc:

VOID WINAPI ExitProcess(
    UINT uExitCode
);

It takes one argument (which can be passed to the thread using a CreateRemoteThread parameter), so the stack frame would not get damaged if we started the thread in the function itself. What could be a little confusing is that ExitProcess does not return any value, while ThreadProc is supposed to. Yeah, this is generally right, but what should be noted is that the function never returns, thus making the return type/value irrelevant.
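Put together, the trick might look like the sketch below. It relies on one assumption worth stating loudly: kernel32.dll is mapped at the same address in our process and in the target, so the ExitProcess address resolved locally is also valid remotely (this normally holds for processes of the same bitness). The function name request_remote_exit is made up; on non-Windows builds it is a stub returning 0.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: starts a thread inside process `pid` whose entry
 * point is ExitProcess and whose parameter is `exit_code`.
 * Returns 1 if the remote thread was created, 0 otherwise. */
int request_remote_exit(unsigned long pid, unsigned int exit_code)
{
#ifdef _WIN32
    HANDLE process, thread;
    FARPROC exit_process;

    /* ASSUMPTION: this address is the same in the target process. */
    exit_process = GetProcAddress(GetModuleHandleA("kernel32.dll"), "ExitProcess");
    if (exit_process == NULL)
        return 0;

    /* Access rights the CreateRemoteThread documentation asks for. */
    process = OpenProcess(PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION |
                          PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ,
                          FALSE, (DWORD)pid);
    if (process == NULL)
        return 0;

    /* The exit code travels as the thread's single lpParameter. */
    thread = CreateRemoteThread(process, NULL, 0,
                                (LPTHREAD_START_ROUTINE)exit_process,
                                (LPVOID)(UINT_PTR)exit_code, 0, NULL);
    if (thread != NULL)
        CloseHandle(thread);
    CloseHandle(process);
    return thread != NULL;
#else
    (void)pid;
    (void)exit_code;
    return 0;                               /* Win32-only technique */
#endif
}
```

Because the target runs ExitProcess itself, its DLLs receive DLL_PROCESS_DETACH notifications, which is what makes this nicer than TerminateProcess.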

4.3 DebugActiveProcess + ExitProcess

There is one more particular issue actually worth mentioning. Instead of the standard, brutal TerminateProcess technique, one can use a “DebugActiveProcess + Quit-the-debugger” method. Let’s assume we have two running processes: A and B. Process A calls the DebugActiveProcess function, passing the process ID of the second program, thus becoming its debugger. There’s a default setting present in every debugger process – every time the debugger shuts down, so do the debugged processes (this behaviour can be easily changed using the DebugSetProcessKillOnExit API function). So, if the A process suddenly decides to quit, B gets terminated too (silently, just like with TerminateProcess). This little technique is claimed to be a good alternative to the standard termination API, sometimes even more effective (according to deus).
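In code, process A’s side of this method is tiny. The sketch below uses a made-up function name, kill_by_debugger_exit, and sets the kill-on-exit behaviour explicitly even though it is the default. Keep in mind that on success the calling process itself exits, so the function only ever returns on failure. On non-Windows builds it is a stub returning 0.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: attaches to process `pid` as its debugger and then
 * quits, taking the debuggee down with it.
 * Returns 0 if attaching failed; does not return on success. */
int kill_by_debugger_exit(unsigned long pid)
{
#ifdef _WIN32
    if (!DebugActiveProcess((DWORD)pid))
        return 0;

    /* Already the default, stated explicitly: debuggees die with us. */
    DebugSetProcessKillOnExit(TRUE);

    /* The debugger quits, so the system terminates the debuggee too. */
    ExitProcess(0);
    return 1;                               /* not reached */
#else
    (void)pid;
    return 0;                               /* Win32-only technique */
#endif
}
```

In practice this would live in a small helper executable, since the caller is sacrificed along with the target.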

5. Conclusion

To be honest, the simple-and-friendly process termination mechanism is way more complex and advanced than I first thought. A number of other related issues (not even mentioned here) have already been discussed elsewhere (check out the References section).

The termination itself is connected very tightly with some of the CSRSS functionalities that are going to be described soon – stay up to date!;)

The post will be updated with some Proof of Concept code snippets in a few days.

Proof of Concept code can now be downloaded from here.

Take care.

6. References & Links

  1. Quick overview of how processes exit on Windows XP
  2. Techniques of Adware and Spyware
  3. Process Invincibility
  4. How To Terminate an Application “Cleanly” in Win32
  5. SetConsoleCtrlHandler Function
  6. The arms race between programs and users
  7. Why do some process stay in Task Manager after they’ve been killed?
  8. The old-fashioned theory on how processes exit