Jan 2, 2015

Base relocation table

A Portable Executable (PE) file usually contains some headers and some sections. One of the sections, that may exist, is the .reloc section, and within this section is a base relocation table. The base relocation table is needed to fix up virtual addresses in the PE file if the PE file not was loaded at its preferred load address.

In this post, I will have a look at a base relocation table and see what's going on there. Further, I will discuss how to read the entries in the base relocation table, and what they are doing in the code. I'm using PE Insider from Cerbero, PEBrowserPro from Smidgeonsoft and WinDbg.

Before checking out a .reloc section in a PE file, let's discuss the .reloc section briefly.

The .reloc section contains a serie of blocks. There is one block for each 4 KB page that contains virtual addresses, which is in need for fix ups. Each block contains an IMAGE_BASE_RELOCATION struct and a serie of entries.

The IMAGE_BASE_RELOCATION is defined as below, according to the header file winnt.h.
typedef struct _IMAGE_BASE_RELOCATION {
    DWORD   VirtualAddress;
    DWORD   SizeOfBlock;
//  WORD    TypeOffset[1];
} IMAGE_BASE_RELOCATION;

The VirtualAddress holds a Relative Virtual Address (RVA) of the 4 KB page, which the relocation applies to. The SizeOfBlock holds the size of the block in bytes (including the size of the IMAGE_BASE_RELOCATION struct).

Each entry is a 2-byte value and represent a location (offset within the 4 KB page pointed out by the VirtualAddress member in the IMAGE_BASE_RELOCATION struct), which needs to be fixed up in case of the PE file is not loaded at its preferred load address.

That was some theory. Now let's take a look at a .reloc section from a simple "Hello world!" program. I've compiled and linked it in Visual Studio 2010 Express in Release mode.

#include <cstdio>

int main()
{
   std::printf("Hello world!");
}

Before proceeding, just a quick note. The program above lack the return 0 statement. This does not cause any trouble, the compiler actually add this statement. This can be seen in the disassembly views which follows below (xor eax, eax, ret).

Let's open the "Hello world!" program in PE Insider.

First, let's check out the preferred load address.

PE Insider - Preferred load address
The preferred load address is the ImageBase value, i.e. 0x00400000.

Next, check out the section table.

PE Insider - section table
Apparently there are five sections needed for my simple "Hello world!" program. We are only interested in the .reloc section, so lets check it out.

PE Insider - .reloc section
The .reloc section contains two blocks (within the red border). Since there is 2 blocks, it means that there is 2 4KB pages that contain virtual addresses which is in need for fix ups.

The first member in the IMAGE_BASE_RELOCATION in the first block is the VirtualAddress member, with the value 0x00001000 (green underscored value). The second member is the SizeOfBlock member, with the value 0x00000128 (black underscored value). In order to calculate the number of entries, we must subtract the the size of the IMAGE_BASE_RELOCATION struct from the SizeOfBlock member, i.e. 0x128-0x8 = 0x120 bytes. Further, we know that each entry is a 2-byte value, which gives the number of entries to 0x90 (DEC: 144).

The VirtualAddress value, in the second block, is 0x00002000 (green underscored value) and the SizeOfBlock value is 0x00000020 (black underscored value). Applying same idea as a for the first block, the number of entries is 0xC (DEC: 12).

So what is the meaning of the information above?

Regarding the first block, the loader understand it must do some fix ups in the 4 KB page that starts at RVA 0x1000. As you can see in the section table above, 0x1000 is where the .text section starts. In other words, the loader must fix up 0x90 (DEC: 144) virtual addresses in the .text section, i.e in the code. Note that the .text section is covered in 1 4 KB page.

Regarding the second block, the loader understand it must do some fix ups in the 4 KB page that starts at RVA 0x2000. As you can see in the section table above, this is the .rdata section. In other words, the loader must fix up 0xC (DEC: 12) virtual addresses in the .rdata section. Note that the. rdata section is covered in 1 4 KB page.

So how do the loader know how to fix up the virtual address, and where it should do it?

In order for the loader to know how to fix up the address, it takes the delta value of the preferred load address and the actual load address. The delta value is then added to the virtual address.

In order for the loader to know where to do the fix ups, it check out each entry in the blocks in the .reloc section. As mentioned above, an entry is a 2-byte value. The 4 most significants bits in the value, represents the type of relocation. In the x86 world, the type is always 3 (HIGHLOW), which means that we add the complete 32 bit value of the delta value to the virtual address that is in need for fix up. The remaining bits (12 bits) of the entry, tells us the offset in the 4 KB page (pointed out by the VirtualAddress member in the IMAGE_BASE_RELOCATION struct), i.e. where to do the fix up in the page.

Let's clarify this with an example. As mentioned above, the .text section will have 0x90 fix ups. Let's go in to detail for the first 2 fix ups.

According to the .reloc section above, the first entry is 0x3001. This means that the type is 3 and the offset, in the page, is 0x001. The second entry is 0x3007, which means that the type is 3 and the offset, in the page, is 0x007.

Thanks to these offset values, the loader now knows that it must fix up the virtual address at RVA 0x1000+0x1 = 0x1001, and RVA 0x1000+0x7 = 0x1007.

Below is the .text section on file, and the affected RVA's within the red border. I've also marked them as 1 and 2, cause I will deal with them seperatly in my explanation. I also use them as references between the screenshots.
PE Insider - .text section - HEXVIEW
So above, we see the binary machine code. This does not really tells us so much, so let's disassemble this part using PEBrowserPro.


PEBrowserPro - .text section - disassembled

Thanks to the disassembled view, we can easily see what's occurs on RVA 0x1001 and RVA 0x1007. Note that the view using virtual addresses (with the preferred load address 0x0040000).

Lets start with case 1, i.e the RVA 0x1001.

At this position, we can see the value 0xF4204000. This is a virtual address (0x004020F4, remember little-endian), which needs to be fixed up according to the .reloc section. Actually, it is a virtual address within the .rdata section. The string literal "Hello world!" is put in the .rdata section. In the disassemble view above, we push the address, to the string literal, on the stack. In case of the PE file not will be loaded at its preferred load address, this value 0xF4204000 needs to be fixed up.

Now let's move on to case 2, i.e. the RVA 0x1007.

At this position, we can see the value 0xA0204000. This is a virtual address (0x004020A0, remember little-endian), which needs to be fixed up according to the .reloc section. This is a virtual address within the .rdata section as well. More specific, the virtual address 0x004020A0 is a part of the Import Address Table (IAT). It can be verified by checking the Directories.

PE Insider - Directories

Each entry in the IAT contains virtual addresses to functions in other modules. In this case, the IAT entry at the virtual address 0x004020A0 contains a virtual address to the function printf in the msvcr100.dll module.

So when executing the CALL instruction (see the disassemble view), the processor finds out printf's virtual address by checking the virtual address 0x004020A0. Before load time, we don't know the virtual address of printf yet, because the DLL (msvcr100.dll) has not been loaded into virtual memory.

As you can understand, if the loader does not load the "Hello world!" program at its preferred load address, the .rdata section will not be located at 0x2000, and the entry, which tells us where to find the printf function is invalid. Therefore the loader must add the delta value to 0x004020F4.

Finally, I will just show what's going on in the .text section, when the "Hello world!" program is executed (loaded in memory). To this, I'm using WinDbg. Note that the "Hello world!" program is linked with the DYNAMICBASE switch, i.e. it's Address Space Layout Randomization (ASLR) compatible.

First, let's find out the load address.

WinDbg - Load address

Remember that according to the PE file, the preferred load address is 0x00400000. The actual load address is 0x00250000, which gives us a delta value 0x0040000-0x00250000 = 0x1B0000 (negative difference).

Second, let's check out the .text section


WinDbg - .text section - Disassembled view

As you see above, the virtual addresses differ from the disassembled view in PEBrowserPro, due to the "Hello world!" program is loaded at 0x00250000.

For instance, according to PEBrowserPro, the push instruction pushes the virtual address 0x004020F4 on the stack. Perfectly OK when the program is loaded at its preferred load address 0x00400000.
But in this case, the program is loaded at 0x00250000. The delta value is 0x1B0000 (calculated above). This is a negative difference. In other words, the loader must fix up virtual address 0x4020F4 by reducing it with 0x1B0000. This gives 0x004020F4-0x1B0000 = 0x002520F4. This is exactly what happened in the screenshot above.

You are welcome to leave comments, complaints or questions!

11 comments:

  1. Best post I've seen on the subject, thank!

    ReplyDelete
  2. great explanation, thanks!

    ReplyDelete
  3. Awesome! Thank you very much dear Anders.

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. Very well Explained, i understood at once, thanks!!!

    ReplyDelete
  6. thanks for this excellent article
    IMHO it is the best article about PE Relocation

    ReplyDelete
  7. Amazing, this was the best explanation ever. Thanks man, truly

    ReplyDelete