Jan 5, 2015

PE file with empty main()

If you build an empty console program, how many bytes are needed for the Portable Executable (PE) file? And what's inside it?

In this post I'm experimenting with an empty console program. I reduce the PE file step by step, until it contains just the headers and the binary machine code. I'm using Visual Studio 2010 Express and building the PE files in Release mode. Further, I'm using PE Insider from Cerbero and PEBrowserPro from Smidgeonsoft.

Let's compile and link the empty program below.
int main()
{
   return 0;
}

A program doing nothing. As simple as it can be. Now, let's check out the size of this program, using file properties.

File properties - size of file


Alright, 6144 bytes of binary data are needed for this empty program. Why is this size needed, and what kind of binary data is in there? Let's fire up PE Insider and check out the header size first, and then the section table.

PE Insider - size of headers

PE Insider - section table

Above, SizeOfHeaders gives the size of the headers, and SizeOfRawData gives the size each section needs on disk (both values are shown in hex).

Let's try to understand the number 6144. This is actually the sum of the size of the headers and the sizes of the sections. So let's add them up to verify this:

0x400+0x800+0x600+0x200+0x200+0x200 = 0x1800 (DEC: 6144)
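The same sum can be written as a small C helper. This is just a minimal sketch: the function name pe_file_size is made up for this post, and it assumes there is nothing in the file besides the headers and the raw data of the sections (no overlay or other trailing data).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum SizeOfHeaders and the SizeOfRawData of every section.
   pe_file_size is a hypothetical helper, not part of any PE library. */
uint32_t pe_file_size(uint32_t size_of_headers,
                      const uint32_t *size_of_raw_data,
                      size_t section_count)
{
    uint32_t total = size_of_headers;

    assert(size_of_raw_data != NULL || section_count == 0);
    for (size_t i = 0; i < section_count; ++i) {
        total += size_of_raw_data[i];
    }
    return total;
}
```

Called with 0x400 for the headers and the five section sizes 0x800, 0x600, 0x200, 0x200 and 0x200, it returns 0x1800, i.e. 6144.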

Okay, now we understand why the size is 6144. But what's inside all this binary data?

First let's check out the .text section, i.e. the code. This section needs 0x800 bytes. However, my empty program is doing nothing!

PE Insider - .text section

Above, the .text section starts at offset 0x400 (file on disk), and there are a lot of things going on.

Again, the main function is doing nothing. But there is a lot of other code in the .text section. It's code from the C Runtime library. For instance, the main function is not the first function called when executing the program. The first function called is the mainCRTStartup function, which is a function in the C Runtime library (and part of the PE file). How do I know this? Well, each program has an entry point, which is specified in one of the headers in the PE file. Let's check it out.
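If you don't have a PE viewer at hand, the field can also be read with a few lines of C. Below is a minimal sketch, assuming the whole file has already been loaded into a byte buffer. The function names are my own. The offsets are the ones from the PE format: e_lfanew at 0x3C in the DOS header, then the 4 byte signature, the 20 byte file header, and AddressOfEntryPoint at offset 16 in the optional header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores AddressOfEntryPoint in *entry_rva on success,
   or 0 if the buffer does not look like a PE file.
   pe_entry_point_rva is a hypothetical helper name. */
int pe_entry_point_rva(const unsigned char *file, size_t size,
                       uint32_t *entry_rva)
{
    uint32_t pe_offset;
    const unsigned char *pe;

    assert(file != NULL && entry_rva != NULL);
    if (size < 0x40 || file[0] != 'M' || file[1] != 'Z') {
        return 0;                       /* no DOS header */
    }
    pe_offset = read_u32_le(file + 0x3C);   /* e_lfanew */
    /* need: signature (4) + file header (20) + 20 bytes of optional header */
    if (pe_offset > size || size - pe_offset < 4 + 20 + 20) {
        return 0;
    }
    pe = file + pe_offset;
    if (pe[0] != 'P' || pe[1] != 'E' || pe[2] != 0 || pe[3] != 0) {
        return 0;                       /* no "PE\0\0" signature */
    }
    *entry_rva = read_u32_le(pe + 4 + 20 + 16);  /* AddressOfEntryPoint */
    return 1;
}
```

The sketch only validates as much as it needs to reach that one field. A real parser would also check the optional header magic and the sizes stored in the file header.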

PE Insider - Entry point
 
PEBrowserPro - disassembly view of entry point

Okay, the entry point is at Relative Virtual Address (RVA) 0x12A0 (within the .text section, of course). Thanks to the disassembly view from PEBrowserPro, we can see what's going on there. At this RVA, the mainCRTStartup function is located.

For the next test, let's tell Visual Studio to use another entry point (i.e. not mainCRTStartup). We can do this in the Property Pages dialog.


Visual Studio - specifying my own entry point

After compiling and linking, let's check the file properties again.

File properties - size of file

Wow! The file size is reduced! From 6144 bytes to 3072 bytes. Let's check out the section table again.

PE Insider - section table

Compared to the screenshots above, the .text section is reduced from 0x800 to 0x200, the .rdata section from 0x600 to 0x200, and the .data section from 0x200 to 0x000. The .rsrc and the .reloc sections remain the same size.

So what does the .text section look like now that we have specified our own entry point?

PE Insider - .text section (main entry point)


PEBrowserPro - disassembly view - .text section (main entry point)

The only thing going on in the .text section is the return 0 statement. We have managed to get rid of the C Runtime Library code.

Note that there are a lot of 0's in the .text section. This is just zero padding, so that each section can start at a multiple of 0x200 in the file.
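The rounding behind this padding is easy to write down. A minimal sketch, assuming the alignment is a power of two (0x200 is one); align_up is just a name I picked, and the sketch does not guard against overflow for sizes close to the 32-bit limit:

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of alignment.
   alignment must be a non-zero power of two, e.g. 0x200. */
uint32_t align_up(uint32_t size, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

So even if the code is only a handful of bytes, say 3, align_up(3, 0x200) gives 0x200, which matches the SizeOfRawData of the .text section above.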

Now let's continue with the other sections in the PE file. The purpose of the .reloc section is to help the loader do some relocation if the executable is not loaded at its preferred load address. If we remove the DYNAMICBASE switch, the .reloc section will be removed. This means that the executable will always be loaded at its preferred load address, and does not need any relocation information. Let's remove the DYNAMICBASE switch.
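To give an idea of what the relocation information is used for, here is a heavily simplified sketch of the fix-up step. The real .reloc section stores blocks of typed 16-bit entries per page; this sketch assumes they have already been decoded into a flat list of RVAs, each pointing at a 32-bit absolute address inside the image. The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add (actual_base - preferred_base) to every 32-bit absolute address
   stored at the given RVAs. Simplified: a flat fix-up list instead of
   the real .reloc block format. */
void apply_relocations(unsigned char *image, size_t image_size,
                       const uint32_t *fixup_rvas, size_t fixup_count,
                       uint32_t preferred_base, uint32_t actual_base)
{
    uint32_t delta = actual_base - preferred_base; /* wraps modulo 2^32 */

    assert(image != NULL);
    assert(fixup_rvas != NULL || fixup_count == 0);
    for (size_t i = 0; i < fixup_count; ++i) {
        uint32_t address;

        /* the 4 bytes to patch must lie inside the image */
        assert(fixup_rvas[i] <= image_size &&
               image_size - fixup_rvas[i] >= sizeof address);
        memcpy(&address, image + fixup_rvas[i], sizeof address);
        address += delta;
        memcpy(image + fixup_rvas[i], &address, sizeof address);
    }
}
```

For example, if the image was linked for base 0x00400000 but ends up at 0x00A50000, a stored address 0x00401008 becomes 0x00A51008. When the image is always loaded at its preferred base, the delta is zero and the fix-up list is not needed.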


Visual Studio - Remove dynamicbase

Next, compile and link again, and check out the file properties and the section table.

File properties - size of file
PE Insider - section table

Voila! The file size has decreased from 3072 bytes to 2560 bytes, i.e. by the 0x200 bytes of the .reloc section. The .reloc section is gone.

Let's continue with the other sections. What's going on in the .rdata section and the .rsrc section?

PE Insider - .rdata section
 
PE Insider - .rsrc section

Thanks to the ASCII view, we can figure out that Visual Studio embeds a manifest in the .rsrc section. We can also see that there is a PDB path in the .rdata section.

If we want to get rid of this, we just go to the Property Pages and remove the manifest as well as the debug information.


Visual Studio - Removing manifest



Visual Studio - Removing debugging information

Again, compile and link and check out the file size, the section table, and the .rdata section.

File properties - size of file
PE Insider - section table
PE Insider - .rdata section, remaining data

Not much is left. The file size is now down to 2048 bytes. The .rsrc section is gone. The .text section contains a couple of bytes of binary machine code and the .data section is empty. Some data remains in the .rdata section. I'm not completely sure what this data is. If you know, please tell me. However, it can be removed by saying no to Whole Program Optimization.


Visual Studio - Whole Program Optimization

Compile and link the PE file and let's have a final look at the file properties and the section table.

File properties - size of file


PE Insider - section table

Well, that's about it! The file size is reduced from 2048 bytes to 1024 bytes. The PE file now contains just the headers and the .text section with a few bytes of machine code.

You are welcome to leave comments, complaints or questions!
