Jul 20, 2016

Representation of data types in memory - Part 2

This is the continuation of the series "Representation of data types in memory". In this part, I will investigate how signed numerical integers are stored in memory. I'm using Windows Vista 32 bit with Microsoft Visual C++ 2010 Express (in Debug Mode) and WinDbg. Note that I'm not considering C++11.

Well, in my previous post, we saw how the unsigned numerical integers were saved in little-endian format in a block of 0xCC. This is of course true for signed numerical integers too, but for them the sign must also be represented.

I will start this post the same way as the previous one, using the same simple program, but with signed numerical integers.
int main()
{
   int a = -1;
   int b = -2;
   int c = -3;
   int d = -4;

   return 0;
}
After starting WinDbg and executing all the initialization statements, we see the result below.

WinDbg - Memory view after initializations
As discussed before, in the Memory view we can see the blue rectangle, which indicates the inserted block of 0xCC that is typical for Debug mode. The blue arrow shows the offset of the Memory view, which is where the insertion of 0xCC starts. Each integer is also "guarded" by four bytes of 0xCC. Now we will focus on how the signed numerical integers are saved in memory. This is done by using the two's-complement representation. What is two's complement? You can read all about it on the net, but here is a short explanation cited from Wikipedia.
"Two's complement is a mathematical operation on binary numbers, as well as a binary signed number representation based on this operation. Its wide use in computing makes it the most important example of a radix complement."
There is a lot of in-depth information on the net about how the two's-complement conversion is done. Here is my short version:

Invert all bits of the (positive) number and add one (+1).

Let's use an actual value from our simple program as an example, for instance:
int a = -1;
Positive number: 1 (decimal)
It's an int, i.e. four bytes in size -> 1 (decimal) = 0x00000001.
Then invert all bits: 0x00000001 -> 0xFFFFFFFE
and then add one (+1): 0xFFFFFFFE -> 0xFFFFFFFF

In other words, the number 0xFFFFFFFF represents -1 in two's-complement representation.
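The invert-and-add-one rule can be sketched in a few lines of C++. This is only an illustration and not something taken from the debugger session. The function name twos_complement_negate is my own, and the sketch assumes that unsigned int is 32 bits wide, as it is with the compiler used in this post.

```cpp
// Hypothetical helper: returns the 32-bit pattern that represents -value.
// Assumes unsigned int is 32 bits wide.
unsigned int twos_complement_negate(unsigned int value)
{
   unsigned int inverted = ~value;   // step 1: invert all bits
   return inverted + 1u;             // step 2: add one (wraps around modulo 2^32)
}
```

For the value 1, the function returns the bit pattern 0xFFFFFFFF.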

We can see this number in the Disassembly view. More specifically, we can see this instruction:
mov     dword ptr [ebp-8],0FFFFFFFFh
It means that -1 is represented as 0xFFFFFFFF and will be stored at memory address ebp-0x8. Taking little-endian into account, the bytes are stored as shown below:

ebp-0x8: 0xFF
ebp-0x7: 0xFF
ebp-0x6: 0xFF
ebp-0x5: 0xFF

Now let's continue with a program similar to the one in the first part of this series, this time with signed numerical integers of different sizes.
int main()
{
   char a = -1;
   short int b = -2;
   int c = -3;
   long int d = -4;

   return 0;
}
After starting WinDbg and executing all the initialization statements, we see the result below.
WinDbg - Memory view after initializations

As in the first simple program, the blue arrow and blue rectangle show the 0xCC bytes inserted by the rep stos instruction.
Let's investigate how each type is stored. We know they are stored in two's-complement little-endian format.

Below, I've used the following shorthand to denote the two's-complement conversion for each type:

Positive number (using correct size) -> Invert all bits -> Add 1

char a = -1;
0x01 -> 0xFE -> 0xFF

ebp-0x5: 0xFF
short int b = -2;
0x0002 -> 0xFFFD -> 0xFFFE

ebp-0x14: 0xFE
ebp-0x13: 0xFF
int c = -3;
0x00000003 -> 0xFFFFFFFC -> 0xFFFFFFFD

ebp-0x20: 0xFD
ebp-0x1F: 0xFF
ebp-0x1E: 0xFF
ebp-0x1D: 0xFF
long int d = -4;
0x00000004 -> 0xFFFFFFFB -> 0xFFFFFFFC

ebp-0x2C: 0xFC
ebp-0x2B: 0xFF
ebp-0x2A: 0xFF
ebp-0x29: 0xFF
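The byte listings above can also be reproduced without a debugger. Below is a minimal sketch that copies the bytes of any object in address order. The helper name bytes_in_memory is my own invention, and the byte order you get depends on the machine. The values in this post assume a little-endian x86 machine.

```cpp
#include <cstring>
#include <vector>

// Hypothetical helper: copies the bytes of an object in address order,
// lowest address first, which is the order the Memory view displays them in.
template <typename T>
std::vector<unsigned char> bytes_in_memory(const T& value)
{
   std::vector<unsigned char> bytes(sizeof(T));
   std::memcpy(&bytes[0], &value, sizeof(T));   // raw copy, no conversion
   return bytes;
}
```

On a little-endian machine, calling it with a short int holding -2 gives 0xFE followed by 0xFF, and with an int holding -3 it gives 0xFD followed by three 0xFF bytes, matching the listings above.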

You are welcome to leave comments, complaints or questions!

Jul 14, 2016

Representation of data types in memory - Part 1

Normally we don't need to bother about how fundamental data types are stored in memory. We just expect it to work. Recently I became curious, so I learned the details of how a fundamental data type is represented in memory, especially on the stack. As usual, I'm using Windows Vista 32 bit with Microsoft Visual C++ 2010 Express and WinDbg. Note that I'm not considering C++11.

I've divided this topic into several posts. In the first one, I will focus on one group of fundamental data types: the numerical integer types, and more specifically, the unsigned ones. In the following posts, I will consider the signed numerical integers, floating types and other aspects.

Well, let's look into the numerical integer types. You probably know the integer types already. Below is a recap cited from www.cplusplus.com.
"Numerical integer types:
They can store a whole number value, such as 7 or 1024. They exist in a variety of sizes, and can either be signed or unsigned, depending on whether they support negative values or not."
Let's start with a very simple program, where we just use the unsigned int type. When compiling this program in Debug mode and executing it, how are the unsigned integers represented in memory?
int main()
{
   unsigned int a = 1;
   unsigned int b = 2;
   unsigned int c = 3;
   unsigned int d = 4;

   return 0;
}
After starting WinDbg and executing all the initialization statements, we see the result below.
WinDbg - Memory view after initializations

The blue arrow above indicates that I've set the Memory view to ebp-0x0F0, according to the Disassembly view. Before any initializations have been done, a block of memory is initialized to 0xCC by the rep stos instruction. This is typically done in Debug mode. The rep stos instruction starts inserting 0xCC at ebp-0x0F0.

Further, we can also see that each initialized integer is "guarded" by four bytes of 0xCC (before and after each integer).

The x86 architecture uses the little-endian format. I will briefly explain the little-endian format here. If you want more in-depth information, just search the net.

The endianness describes how a sequence of bytes is stored in memory. It means that the endianness only matters for data types with more than one byte in size. Little-endian means that the Least Significant Byte (LSB) is stored at the lowest address and the Most Significant Byte (MSB) is stored at the highest address.
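Whether a machine is little-endian can be checked at run time by looking at the byte stored at the lowest address of a known value. The following is a minimal sketch. Both function names are my own and are not part of any library.

```cpp
#include <cstring>

// Hypothetical helper: returns the byte stored at the lowest address of value.
unsigned char lowest_address_byte(unsigned int value)
{
   unsigned char first;
   std::memcpy(&first, &value, 1);   // copy only the first byte in memory
   return first;
}

// True when the least significant byte comes first in memory.
bool is_little_endian()
{
   return lowest_address_byte(1u) == 0x01;
}
```

On x86, is_little_endian() is expected to return true.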

From the example above, we are dealing with integers, so let's use an integer as an example. An integer is four bytes in size, which can be seen in the Memory view. An integer can be written as "ByteA ByteB ByteC ByteD", where ByteA is the MSB and ByteD is the LSB. With the little-endian format, the memory will look like this:

Base Address: ByteD
Base Address + 1: ByteC
Base Address + 2: ByteB
Base Address + 3: ByteA
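The layout above can be written out with shifts and masks. This is a sketch of the idea only. The function name store_little_endian is hypothetical, and it assumes a 32-bit unsigned int. Because it works on the value and not on the raw memory, it produces the same result on any machine.

```cpp
// Hypothetical helper: writes a 32-bit value to out[0..3] in little-endian
// order, independent of the byte order of the machine that runs it.
void store_little_endian(unsigned int value, unsigned char out[4])
{
   out[0] = static_cast<unsigned char>(value & 0xFF);          // ByteD (LSB) at Base Address
   out[1] = static_cast<unsigned char>((value >> 8) & 0xFF);   // ByteC at Base Address + 1
   out[2] = static_cast<unsigned char>((value >> 16) & 0xFF);  // ByteB at Base Address + 2
   out[3] = static_cast<unsigned char>((value >> 24) & 0xFF);  // ByteA (MSB) at Base Address + 3
}
```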

Let's make the example above more realistic by using an actual value from our simple program, for instance;

   unsigned int a = 1;

As we know, the unsigned int is four bytes in size, so the number will be 0x00000001. According to the Disassembly view, the value from the statement "unsigned int a = 1;" will be saved on the stack at memory address ebp-0x8. The LSB, i.e. 0x01, is saved at memory address ebp-0x8, and the other bytes are saved as shown below.

ebp-0x8: 0x01
ebp-0x7: 0x00
ebp-0x6: 0x00
ebp-0x5: 0x00
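To tie the 0xCC fill and the stored value together, here is a toy model of the stack area as a plain byte buffer. It is not what the compiler generates. The start offset ebp-0x0F0 and the slot at ebp-0x8 are taken from this example, while the assumption that the fill runs all the way up to ebp, and the name simulate_debug_frame, are mine.

```cpp
#include <cstddef>
#include <vector>

// Toy model of the area [ebp-0xF0, ebp): index 0 corresponds to ebp-0xF0.
// Assumption: the 0xCC fill covers the whole area up to ebp.
const std::size_t kFrameSize = 0xF0;

std::vector<unsigned char> simulate_debug_frame(unsigned int value)
{
   // Mimic the rep stos fill: every byte starts out as 0xCC.
   std::vector<unsigned char> frame(kFrameSize, 0xCC);

   // Store a four-byte value at ebp-0x8, least significant byte first.
   const std::size_t slot = kFrameSize - 0x8;
   for (std::size_t i = 0; i < 4; ++i)
      frame[slot + i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);

   return frame;
}
```

With the value 1, the four bytes at ebp-0x8 to ebp-0x5 become 01 00 00 00, and the bytes directly before and after them keep the value 0xCC, which is the "guard" pattern visible in the Memory view.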

The example above dealt only with unsigned int. Now we move on to another simple program, which shows the data representation of each unsigned numerical integer type: char, short int, int and long int.
int main()
{
   unsigned char a = 1;
   unsigned short int b = 2;
   unsigned int c = 3;
   unsigned long int d = 4;

   return 0;
}
After starting WinDbg and executing all the initialization statements, we see the result below.
WinDbg - Memory view after initializations
As in the first simple program, the blue arrow and blue rectangle show the 0xCC bytes inserted by the rep stos instruction.

Let's investigate how each type is stored. We know they are stored in little-endian format, meaning the LSB is stored in the lowest address.

   unsigned char a = 1;

ebp-0x5: 0x01

   unsigned short int b = 2;

ebp-0x14: 0x02
ebp-0x13: 0x00

   unsigned int c = 3;

ebp-0x20: 0x03
ebp-0x1F: 0x00
ebp-0x1E: 0x00
ebp-0x1D: 0x00

   unsigned long int d = 4;

ebp-0x2C: 0x04
ebp-0x2B: 0x00
ebp-0x2A: 0x00
ebp-0x29: 0x00

From the results above, we can note that the unsigned char is only one byte in size, so the endianness does not matter. We can also see that both unsigned int and unsigned long int are four bytes in size.
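The sizes can also be queried with sizeof instead of being read from the Memory view. The sketch below collects them in a small struct. The struct and function names are hypothetical. Note that the sizes depend on the compiler and platform: the four-byte unsigned long int seen here applies to 32-bit Windows, while other platforms may use a larger one, for example eight bytes on 64-bit Linux.

```cpp
#include <cstddef>

// Sizes in bytes of the four unsigned types used above, as seen by the
// compiler that builds this code.
struct IntegerSizes
{
   std::size_t char_size;
   std::size_t short_size;
   std::size_t int_size;
   std::size_t long_size;
};

IntegerSizes unsigned_integer_sizes()
{
   IntegerSizes sizes;
   sizes.char_size  = sizeof(unsigned char);    // always 1 by definition
   sizes.short_size = sizeof(unsigned short);
   sizes.int_size   = sizeof(unsigned int);
   sizes.long_size  = sizeof(unsigned long);
   return sizes;
}
```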

You are welcome to leave comments, complaints or questions!

Feb 7, 2016

DLL loading and initializing

A couple of years ago, I was developing a Windows application using Borland's IDE. This is a Rapid Application Development (RAD) tool. It is very easy to create a Windows application by just clicking the controls you want and adding the code for the events you want.

I realized that the appearance of my Windows application was very boring compared to other applications. I learned that my application was using the old Standard theme, and the other ones were using Visual Styles. After some research, I was able to take advantage of the Visual Styles as well, by adding a manifest to my application. The secret was to load version 6 (or higher) of comctl32.dll (and not version 5).

A couple of weeks ago, I started to dig into the details of the loaded comctl32.dll and the role of the manifest. What was inside version 6 of comctl32, which was not present in version 5? During my research, I learned how and when the DLL was loaded into the address space of the process, and further, I learned how and when DLLs in general are loaded into the address space and initialized. This is the subject for this post. In another post, I will expand the discussion and go into details for comctl32.dll, Visual Styles, atom tables and Activation Context.

Let's start by checking out a very simple Windows application. It is built in Borland's IDE in Debug mode, using only Win32 DLLs (not Borland's dynamic libraries).

A simple application in the IDE

A simple application
This application is clearly not using Visual Styles. It is using the old Standard theme, with the old boring 3D style.

Now let's check out which DLLs are loaded into the address space of the process for this simple application. We can easily see them using one of my favorite tools, Process Explorer.

A simple application with its DLLs in Process Explorer
Amazing, so many DLLs needed just for a simple window with a button.

When I saw the loaded DLLs for the first time, the first question that came to my mind was: why is comctl32.dll loaded twice, with different versions? Within this process, comctl32.dll was loaded both as version 5.82 and as version 6.10.

To simplify this post, I will only deal with the 5.82 version of comctl32.dll, which is responsible for the old boring Standard theme. As far as I can tell at the moment, the 6.10 version of comctl32.dll is needed for the non-client area of my simple application. This can be proven by removing the border from my simple application. Just set the BorderStyle property to bsNone in the Borland IDE.

Non-client area of my simple application (border)


BorderStyle property in the IDE


A simple application without border (only client area)
A simple application without border with its DLLs in Process Explorer
Above, we see that only the 5.82 version of comctl32 is loaded. As mentioned before, the 5.82 version of comctl32.dll is responsible for the old boring 3D style. Further, we can see that shlwapi.dll is also missing. To conclude, my simple application without border needs a total of 19 DLLs for various reasons.

From now on, I will refer to my simple application without border as just "button.exe".

Alright, now we have concluded that button.exe needs 19 DLLs loaded into the address space of the process. Let's look at some properties of these DLLs.

First, I will look into which DLLs are implicitly linked. We can do that by waking up my good old friend Dependency Walker. Dependency Walker will show us which DLLs are implicitly linked to button.exe.


Implicitly linked DLLs (and their dependencies) in Dependency Walker
Above are the 7 DLLs which are implicitly linked, i.e. they are listed in the Import Name Table (INT) of button.exe. One of my previous posts discusses the INT for notepad.exe; you may want to check it out here.

We can conclude that none of the 7 DLLs are delay loaded (from button.exe's point of view). However, some of the seven DLLs have dependencies on other DLLs, which in turn are implicitly linked or delay loaded. For instance, user32.dll has three delay loaded DLLs, which are indicated by the hourglass (msimg32, powrprof, windsta). For information about delay loaded DLLs, you may want to check out this link.

The next property I'm going to investigate, is if the DLL has an embedded manifest.
I have not used a manifest for button.exe. However, any of the loaded DLLs in the address space of the process may use a manifest. One way to view the embedded manifest, is to use the sigcheck tool from Sysinternals.

Sigcheck from Sysinternals can be used to view embedded manifest. Advapi32.dll has no embedded manifest.

Below is a summary of what we have concluded so far about the DLLs within the address space of the process button.exe. The first table shows the implicitly linked DLLs, and the second one shows the DLLs loaded at run-time for various reasons.

DLL name        Version        Path        Embedded manifest
advapi32.dll    6.0.6002       System32    No
kernel32.dll    6.0.6002       System32    No
version.dll     6.0.6002       System32    No
comctl32.dll    5.82.6002      WinSxS      No
gdi32.dll       6.0.6002       System32    No
user32.dll      6.0.6002       System32    No
oleaut32.dll    6.0.6002       System32    No

DLL name        Version        Path        Embedded manifest
clbcatq.dll     2001.12.6931   System32    No
fshook32.dll    -              -           No
imm32.dll       6.0.6002       System32    No
lpk.dll         6.0.6002       System32    No
msctf.dll       6.0.6002       System32    No
msvcrt.dll      7.0.6002       System32    No
ntdll.dll       6.0.6002       System32    No
ole32.dll       6.0.6002       System32    No
psapi.dll       6.0.6000       System32    No
rpcrt4.dll      6.0.6002       System32    No
usp10.dll       1.626.6002     System32    No
uxtheme.dll     6.0.6001       System32    Yes


I've chosen not to investigate fshook32.dll further, since this DLL is not really a Windows DLL; it is just a DLL which is hooked into the process. As a matter of fact, fshook32.dll is dependent on psapi.dll, which is the reason why psapi.dll is loaded into the address space.

So now we know which DLLs are loaded into the address space of the process. The next thing to investigate is when they are loaded.

This can easily be seen in WinDbg. When firing up WinDbg with button.exe, we see that all the 19 DLLs are loaded into the address space of the process.


WinDbg console output when running my simple application
First we can see that button.exe is loaded into the address space. Then follows a set of loaded DLLs (before the first chance exception). These are the implicitly linked DLLs, including their dependent DLLs. For instance, rpcrt4.dll is not implicitly linked to button.exe, but advapi32.dll is. Since advapi32.dll is dependent on rpcrt4.dll, rpcrt4.dll is loaded as well.

After the first chance exception, a new set of DLLs is loaded into the address space. These are DLLs which are explicitly loaded, including their dependent DLLs.

I will now summarize in a timeline when a DLL is loaded, and when its entry point function is executed. To find out the exact order in which each DLL is loaded and initialized, I will set a breakpoint in ldrpmapdll in ntdll.dll. When this function is executed, the DLL is loaded into the address space of the process. When I know the DLL is loaded, I'm able to set a new breakpoint in the entry point function of the DLL to find out when it is initialized.

Instead of setting a breakpoint in ldrpmapdll, I could have set one in LoadLibrary. However, LoadLibrary both loads the DLL into the address space and initializes it, so a breakpoint there would not let me separate the two events.

Note that ldrpmapdll is only executed if the DLL is not already loaded into the address space. The program can call LoadLibrary several times with the same DLL, but if the DLL is already loaded, ldrpmapdll will not be executed. In other words, ldrpmapdll is only executed once for each DLL. In my research below, I am only taking notice of when the DLL is loaded for the first (and only) time.

I will also set a breakpoint in button's entry function as well as in WinMain. However, I'm not able to use symbols in WinDbg for my Borland executable, so I insert one in the code.

Added debugbreak in button's WinMain code
Alright, that is what the procedure will look like. It is now time to actually do this.

Let's start and open button.exe in WinDbg and set a breakpoint in each loaded DLL's entry point after the first chance exception.

Using !dlls in WinDbg

When using the !dlls command, we can see the entry point for each loaded DLL. At this moment, only implicitly linked DLLs (and their dependent DLLs) have been loaded. In the screenshot above, we also see the loaded button.exe, whose entry point is 0x00401374.

We can see that ntdll.dll has an entry point at address 0x00000000, which seems strange. I don't really know what this means, but my guess is that ntdll does not have a regular entry point. If you know, please let me know.

So now let's set a breakpoint in each entry point function as well as in the ldrpmapdll function.

Breakpoints in WinDbg
Remember that I've set a breakpoint (in code) in button.exe.

Now it is time to start stepping through the execution.

In the beginning, nothing in particular is happening: rpcrt4, advapi32, msvcrt and version are initialized. Then it is time for user32 to be initialized, and the entry point function UserClientDLLInitialize is called. During this initialization code, a lot of stuff is happening.

First, the debugger breaks in the function ldrpmapdll. Apparently, the user32.dll entry point function calls a function named _InitializeImmEntryTable, which in turn calls the LoadLibraryW function.
When stepping further, we can see that imm32.dll is loaded into the address space. Immediately after the DLL is loaded, I use the !dlls command again to find out the entry point and set a breakpoint. This is shown below.


Callstack in WinDbg for ldrpmapdll
 

imm32.dll is loaded into the address space




Setting a breakpoint when imm32.dll is loaded into the address space
When stepping further again, another break appears in the ldrpmapdll function. When stepping further, we will see that msctf.dll is loaded into the address space. As I did with imm32.dll, I will immediately set a breakpoint in msctf's entry point function.

This time we have loaded a DLL that is implicitly linked to imm32.dll into the address space


msctf is loaded into the address space



Setting a breakpoint when msctf.dll is loaded into the address space
At this moment, two additional DLLs have been loaded into the address space: imm32.dll and msctf.dll. Msctf.dll was loaded because imm32.dll is dependent on it.

When stepping further, the following two breakpoints are observed.

Entry point for msctf.dll is executed

Entry point for imm32.dll is executed

We can see that the two newly loaded DLLs are initialized in reverse order. Msctf.dll was loaded after imm32.dll, but msctf.dll was initialized before imm32.dll.
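The observed order, where a DLL is mapped before its dependencies but initialized after them, can be illustrated with a small toy model. This is not how the real Windows loader is implemented. It only mimics the ordering seen in the debugger, and all names in the sketch (DependencyMap, LoaderTrace, load_module) are my own.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model, not the real Windows loader: a module is "mapped" when it is
// first visited and "initialized" only after everything it depends on.
typedef std::map<std::string, std::vector<std::string> > DependencyMap;

struct LoaderTrace
{
   std::vector<std::string> mapped;       // order in which modules enter the address space
   std::vector<std::string> initialized;  // order in which entry points run
};

void load_module(const std::string& name, const DependencyMap& deps, LoaderTrace& trace)
{
   for (std::size_t i = 0; i < trace.mapped.size(); ++i)
      if (trace.mapped[i] == name)
         return;                          // already mapped: nothing to do

   trace.mapped.push_back(name);          // map the module first

   DependencyMap::const_iterator it = deps.find(name);
   if (it != deps.end())
      for (std::size_t i = 0; i < it->second.size(); ++i)
         load_module(it->second[i], deps, trace);   // then map its dependencies

   trace.initialized.push_back(name);     // run the entry point last
}
```

If imm32.dll is registered with msctf.dll as its only dependency, loading imm32.dll records the mapping order imm32.dll, msctf.dll and the initialization order msctf.dll, imm32.dll. A second request for imm32.dll changes nothing, in the same way that ldrpmapdll is only executed once for each DLL.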

When stepping further, we will see that lpk.dll is loaded into the address space, as well as the DLLs it depends on (in this case lpk.dll depends on usp10.dll). This is the same procedure as was done for imm32.dll and msctf.dll, so I will not go into detail.

We are about to load lpk.dll

As I did for imm32.dll and msctf.dll, I've set breakpoints in their entry point functions. I can conclude that lpk.dll and usp10.dll are loaded and initialized in the same manner as imm32.dll and msctf.dll.

Finally, we are finished with the entry point for user32.dll.

When stepping further, WinDbg breaks in the initialization functions of gdi32, comctl32, ole32 and oleaut32.

The next break appears in our button entry point function.

Break in button.exe entry point
Once I've entered the entry point function, I remove the breakpoint. For unknown reasons, I've encountered some strange behavior when keeping this breakpoint: the callstack has ended up in the kernel part of the memory space.

Removing the breakpoint in button.exe entry point function
Stepping further, we will see that two additional DLLs are loaded into the address space. These are fshook32.dll and psapi.dll. They are loaded and initialized the same way as imm32.dll and msctf.dll.

When fshook32.dll and psapi.dll are loaded and initialized, another break appears in ldrpmapdll. This time it is uxtheme.dll. It is loaded and initialized.

Stepping further again, we have finally reached the WinMain function in button.exe. Below, we can see the code I've manually added before.

Breakpoint in WinMain in button.exe

When WinMain is executed, another break appears in ldrpmapdll.

Clbcatq.dll is delay-loaded

Above, I am not showing the complete callstack, but we can understand that this DLL is delay-loaded, thanks to the __delayLoadHelper2 function in the callstack. When stepping further, we see that clbcatq.dll is loaded into the address space.

The fact that clbcatq.dll is delay-loaded can be confirmed by opening ole32.dll in Dependency Walker.

Clbcatq.dll is a delay-loaded DLL.
Clbcatq.dll was the final DLL to be loaded, so now our application has finally started, and we can see the window on the screen.



button.exe has finally created the window visible on the screen.

To conclude the flow of loaded/initialized DLLs, I've drawn a simple flow diagram.
Flow of the loaded/initialized DLLs
The timeline starts at the button (the green cylinder), and ends at clbcatq (red rectangle). I was not able to create a straight timeline, since there were too many events, so I had to draw a timeline with some turns.

As we can see, we start by loading button.exe into the address space, then follows a set of DLLs loaded into the address space. Rpcrt4.dll is the first DLL which will be initialized (execution of the entry point function). When user32.dll is initialized, a lot of stuff is happening, so that's why I've drawn a red border around some blocks, to show that all these DLLs are processed within the user32 initialization code.

Between the button.exe entry point and WinMain, some additional DLLs are loaded and initialized. And finally, during WinMain execution, a delay-loaded DLL is loaded into the address space.

Another interesting note; during the stepping in button.exe in WinDbg, a break was made in each entry point, except for kernel32.dll. I guess it is because kernel32.dll is initialized before we have a chance to set a breakpoint. Ntdll.dll and kernel32.dll are probably guaranteed to be loaded and initialized before anything else is loaded.

You are welcome to leave comments, complaints or questions!