In my previous post, I made a graphical memory layout for the notepad process. I showed the complete 4 GB virtual memory space (or at least the user-mode virtual memory).
In this post, I'm going to do a graphical memory layout for the executable notepad.exe.
The file notepad.exe is a Portable Executable (PE) file. I'm not going to describe the PE format in full here; there are plenty of good resources on the net where you can find that information. However, I will briefly describe some parts of the PE file.
Let's inspect notepad.exe. There are several good tools for this. I'm using PE Insider from Cerbero, which opens the executable and presents the file contents in a readable way. Below is a screenshot taken after opening notepad.exe. It shows the major parts that make up the PE file.
When notepad.exe is loaded into virtual memory (i.e. when notepad starts), the headers and sections are copied into memory. Exactly how is explained below. (Strictly speaking, the information in the PE file is copied into memory, but the loader may overwrite some of it at load time, so the PE file and the memory image may not contain exactly the same information. That does not affect the memory layout.)
For the headers, PE Insider gives us the information in the screenshot below. The headers take up 0x400 bytes of the PE file. Strictly speaking, that figure is rounded up: the file alignment is 512 bytes (0x200), so the header data is padded to the next multiple of that value. Offset is the offset within the PE file and Size is the size (in bytes) of the value at that offset.
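The rounding to the file alignment can be sketched in a few lines of Python. This is only an illustration: the helper name align_up is my own, and the unpadded header size of 0x3A0 is a made-up value, since PE Insider only shows the rounded figure here.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


FILE_ALIGNMENT = 0x200  # 512 bytes, the file alignment of notepad.exe

# Hypothetical unpadded header size, chosen only for illustration.
raw_header_size = 0x3A0

padded_header_size = align_up(raw_header_size, FILE_ALIGNMENT)
print(hex(padded_header_size))  # 0x400
```

Any unpadded size from 0x201 up to 0x400 would give the same result, which is why the rounded value alone does not tell us the exact amount of header data.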
A PE file contains a number of sections. More specifically, notepad.exe contains the following sections (screenshot from PE Insider). For now, just think of a section as a block of binary information.
Each section is copied from the file into virtual memory (by the loader) according to the VirtualAddress column. The VirtualAddress column contains the Relative Virtual Address (RVA), which is an address relative to the image base address (the image base address is where the image is loaded in virtual memory). Since my 32-bit Windows Vista uses Address Space Layout Randomization (ASLR), we don't know the image base address beforehand. But let's take an example. During one execution of notepad, the image base address was 0x00eb0000. According to the table above, the RVA for the .text section is 0x1000, so the final virtual address for the .text section is 0x00eb1000. The virtual address for the .data section is 0x00eba000. The table also gives us the size of each section in the VirtualSize column.
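The address calculation is just an addition, as this small Python sketch shows. The function name rva_to_va is my own, and the .data RVA of 0xA000 is inferred by subtracting the image base from the virtual address given above.

```python
IMAGE_BASE = 0x00EB0000  # base observed during one run; it changes with ASLR

# RVAs for two of the sections; .data is inferred from 0x00eba000 - IMAGE_BASE.
section_rvas = {".text": 0x1000, ".data": 0xA000}


def rva_to_va(image_base: int, rva: int) -> int:
    """Turn a Relative Virtual Address into an absolute virtual address."""
    return image_base + rva


for name, rva in section_rvas.items():
    print(f"{name}: 0x{rva_to_va(IMAGE_BASE, rva):08x}")
```

With a different image base on the next run, the RVAs stay the same and only the sum changes.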
I will not go into detail about what each section contains, but I can mention that the .text section mainly contains binary code for the processor to execute. I say mainly, because when I inspected this section I found that it also contains the Import Address Table (IAT). I will probably write a post about this in the future.
Thanks to the table above, I was able to create this diagram in Excel.
This is what notepad.exe looks like when it is loaded into memory. Note that the diagram does not take the image base address into account. I've drawn the diagram starting from 0 (zero) rather than from an actual image base address (e.g. 0x00eb0000). The headers are copied to low memory, followed by the four sections. Note that each section is aligned to a 4 KB page. For instance, the .text section is mapped at RVA 0x1000 (DEC: 4096), which leaves a zero-padded gap after the headers. The last section is the .reloc section, which is 0xd18 bytes in size and starts at RVA 0x27000. The SizeOfImage is 0x28000 bytes (DEC: 163840), which means that the final bytes of the image are zero-padded as well.
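The numbers above can be checked with a short Python sketch. It assumes that SizeOfImage is the end of the last section rounded up to the 4 KB alignment, which matches the values for notepad.exe; the helper align_up and the variable names are my own.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


SECTION_ALIGNMENT = 0x1000  # 4 KB page
HEADERS_SIZE = 0x400        # size of the headers in the file
TEXT_RVA = 0x1000           # first section
RELOC_RVA = 0x27000         # last section
RELOC_VIRTUAL_SIZE = 0xD18

# Zero-padded gap between the end of the headers and the .text section.
gap_after_headers = TEXT_RVA - HEADERS_SIZE

# End of the last section, rounded up to the section alignment.
end_of_reloc = RELOC_RVA + RELOC_VIRTUAL_SIZE
size_of_image = align_up(end_of_reloc, SECTION_ALIGNMENT)

# Zero-padded bytes after the .reloc data.
tail_padding = size_of_image - end_of_reloc

print(hex(gap_after_headers))             # 0xc00
print(hex(size_of_image), size_of_image)  # 0x28000 163840
print(hex(tail_padding))                  # 0x2e8
```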
You are welcome to leave comments, complaints or questions!