Jan 28, 2024

Multiple definitions in COFF

A couple of days ago, I was curious about how the linker deals with multiple definitions of a function, so I read Microsoft's online PE documentation to see how the COFF format handles multiple definitions. In this post, I'm going to investigate some simple object files with Microsoft's dumpbin. I'm using Visual Studio 2019, C++14, 32-bit. The compiler flags are stated further down in this post. So grab another cup of coffee and let's get started!

As mentioned above, the online PE documentation is of great help when understanding the structure of the object file. It states that a section is a basic unit of code or data within a PE or COFF file. Below, we will learn that COMDAT is a special type of section, and we will further dig into this. 

First, I will briefly discuss the symbol table using an introductory example. Then I will present a couple of setups, each slightly different from the others. My focus will be on the .text sections, since that is where the code resides. Below is the code for this introductory example.
int main()
{
   return 0;
}
The code resides inside main.cpp and is compiled into main.obj, using these flags.
/JMC /permissive- /ifcOutput "Debug\" /GS /analyze- /W3 /Zc:wchar_t /ZI /Gm- /Od /sdl /Fd"Debug\vc142.pdb" /Zc:inline /fp:precise /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /RTC1 /Gd /Oy- /MDd /FC /Fa"Debug\" /EHsc /nologo /Fo"Debug\" /Fp"Debug\project.pch" /diagnostics:column
I will dump the symbol table by using dumpbin, like this:
dumpbin /symbols main.obj
Dumping the symbols gives output below (truncated).
COFF SYMBOL TABLE
000 010575C9 ABS    notype       Static       | @comp.id
001 80010391 ABS    notype       Static       | @feat.00
002 00000002 ABS    notype       Static       | @vol.md
003 00000000 SECT1  notype       Static       | .drectve
    Section length   85, #relocs    0, #linenums    0, checksum        0
    Relocation CRC 00000000
006 00000000 SECT2  notype       Static       | .debug$S
    Section length  1E0, #relocs    2, #linenums    0, checksum        0
    Relocation CRC BE724FAD
009 00000000 SECT3  notype       Static       | .msvcjmc
    Section length    1, #relocs    0, #linenums    0, checksum 77073096
    Relocation CRC 00000000
00C 00000000 SECT3  notype       Static       | __D3C22540_main@cpp
00D 00000000 SECT4  notype       Static       | .text$mn
    Section length    5, #relocs    0, #linenums    0, checksum 672BE856, selection    2 (pick any)
    Relocation CRC 4A6A8444
010 00000000 SECT5  notype       Static       | .debug$S
    Section length   98, #relocs    3, #linenums    0, checksum        0, selection    5 (pick associative Section 0x4)
    Relocation CRC B9D5806A
013 00000000 SECT6  notype       Static       | .text$mn
    Section length   37, #relocs    3, #linenums    0, checksum A2398015, selection    1 (pick no duplicates)
    Relocation CRC 2E7CF6B9
016 00000000 SECT7  notype       Static       | .debug$S
    Section length  104, #relocs    9, #linenums    0, checksum        0, selection    5 (pick associative Section 0x6)
    Relocation CRC 351E60C8
019 00000000 SECT6  notype ()    External     | _main
In the truncated table above, we can see 26 entries from the symbol table, where the 26th entry is telling us that the symbol _main is external and belongs to section #6. 
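
To make the columns in the dump a bit more concrete, here is a minimal sketch of how one symbol table record could be decoded by hand. It assumes the regular 18-byte record layout from the PE documentation (an object file compiled with /bigobj uses a different layout) and a short name of at most 8 characters. Longer names, such as the mangled C++ names further down, are stored in the string table, which this sketch does not handle. The names CoffSymbol, parseSymbol, readU16 and readU32 are my own and do not come from any Windows header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// One 18-byte COFF symbol record. All names here are illustrative only.
struct CoffSymbol {
    std::string name;           // short name, at most 8 characters
    std::uint32_t value;        // dumpbin prints this in the second column
    std::int16_t section;       // 1-based section number (SECT6 -> 6)
    std::uint16_t type;         // 0x20 marks a function, printed as "()"
    std::uint8_t storageClass;  // 2 = External, 3 = Static
    std::uint8_t auxCount;      // number of auxiliary records that follow
};

constexpr std::size_t kSymbolSize = 18;
constexpr std::uint8_t kClassExternal = 2;
constexpr std::uint8_t kClassStatic = 3;

// Little-endian helpers. The casts keep the shifts in unsigned arithmetic.
inline std::uint16_t readU16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t readU32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decodes the record that starts at raw and spans kSymbolSize bytes.
inline CoffSymbol parseSymbol(const std::uint8_t* raw) {
    assert(raw != nullptr);
    CoffSymbol s;
    // The name is NUL-padded to 8 bytes, but not terminated if it uses all 8.
    std::size_t len = 0;
    while (len < 8 && raw[len] != 0) ++len;
    s.name.assign(reinterpret_cast<const char*>(raw), len);
    s.value = readU32(raw + 8);
    s.section = static_cast<std::int16_t>(readU16(raw + 12));
    s.type = readU16(raw + 14);
    s.storageClass = raw[16];
    s.auxCount = raw[17];
    return s;
}
```

Feeding it a hand-built record for _main with section number 6, type 0x20 and storage class 2 gives back the same information dumpbin prints as "SECT6 notype () External | _main".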

From the online PE documentation, we can understand that sections are defined inside the symbol table as well. Such an entry, for example SECT6 (entry 0x013 above), is followed by another entry (entry 0x014 above) that provides information about the section. This entry is in Auxiliary Format 5, and since section #6 is a COMDAT section (see below for explanation), a selection type is defined, in this case 1 (pick no duplicates). The selection type tells the linker how to deal with multiple definitions of COMDAT sections. In this particular case, the linker accepts only one definition; if there are more, it issues a multiple definition error.
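
As a small illustration, the sketch below reads the selection byte out of an Auxiliary Format 5 record and turns it into a label. It assumes the 18-byte layout from the PE documentation, where the selection is the single byte at offset 14. The labels for 1, 2 and 5 are the ones dumpbin prints above; the labels for 3, 4 and 6 are my own wording for the remaining selection types in the documentation. The function and enumerator names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// COMDAT selection values as listed in the PE documentation.
enum ComdatSelection : std::uint8_t {
    kNoDuplicates = 1,
    kAny = 2,
    kSameSize = 3,
    kExactMatch = 4,
    kAssociative = 5,
    kLargest = 6
};

// Returns the selection byte of an 18-byte Auxiliary Format 5 record.
// Layout assumed: length (4), #relocs (2), #linenums (2), checksum (4),
// associated section number (2), selection (1), unused (3).
inline std::uint8_t selectionOf(const std::uint8_t* aux) {
    assert(aux != nullptr);
    return aux[14];
}

// Maps a selection value to a human-readable label.
inline std::string describeSelection(std::uint8_t selection) {
    switch (selection) {
        case kNoDuplicates: return "pick no duplicates";
        case kAny:          return "pick any";
        case kSameSize:     return "pick same size";
        case kExactMatch:   return "pick exact match";
        case kAssociative:  return "pick associative";
        case kLargest:      return "pick largest";
        default:            return "unknown";
    }
}
```

For the record that follows SECT6 in the table above (length 0x37, 3 relocations, selection 1), describeSelection(selectionOf(aux)) would yield the label for selection type 1.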

Further, I will dump the headers for each section, like this:
dumpbin /headers main.obj
Dumping the headers gives the output below (only section header #6 pasted).
SECTION HEADER #6
.text$mn name
       0 physical address
       0 virtual address
      37 size of raw data
     501 file pointer to raw data (00000501 to 00000537)
     538 file pointer to relocation table
       0 file pointer to line numbers
       3 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= _main
         16 byte align
         Execute Read
In the truncated output above, we can see the section flags 60501020. The online PE documentation (Section Flags) can be used to decode this value. It is also briefly described in the output above: the section contains executable code, it is a COMDAT section, its data is aligned on a 16-byte boundary, and it can be executed as code and read.
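
The flag value can also be taken apart by hand. The sketch below is a rough decoder for only the bits that occur in 60501020, using the values from the Section Flags table in the PE documentation. The constant and function names are made up for this post. The alignment is stored as a 4-bit field where the value n stands for 2^(n-1) bytes, so the 5 in 0x00500000 means 16 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Subset of the section flags from the PE documentation.
constexpr std::uint32_t kCntCode    = 0x00000020;  // IMAGE_SCN_CNT_CODE
constexpr std::uint32_t kLnkComdat  = 0x00001000;  // IMAGE_SCN_LNK_COMDAT
constexpr std::uint32_t kAlignMask  = 0x00F00000;  // IMAGE_SCN_ALIGN_* field
constexpr std::uint32_t kMemExecute = 0x20000000;  // IMAGE_SCN_MEM_EXECUTE
constexpr std::uint32_t kMemRead    = 0x40000000;  // IMAGE_SCN_MEM_READ

// True if every bit of flag is set in flags.
inline bool hasFlag(std::uint32_t flags, std::uint32_t flag) {
    return (flags & flag) == flag;
}

// Returns the alignment in bytes, or 0 if no alignment is specified.
inline std::uint32_t alignmentBytes(std::uint32_t flags) {
    const std::uint32_t n = (flags & kAlignMask) >> 20;
    assert(n <= 14);  // the documentation defines 1 (1 byte) up to 14 (8192 bytes)
    return n == 0 ? 0u : (1u << (n - 1));
}
```

With flags = 0x60501020, hasFlag reports the code, COMDAT, execute and read bits as set, and alignmentBytes returns 16, which matches the lines dumpbin prints under the flags value.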

Now let's look into the different setups. Each setup contains a main.cpp with a main function and a variation of additional source files. The figure summarizes the source files and the resulting object files with their symbols. The compiler options are the same as in the introductory example above. In each setup, after the figure, I will present the important parts of the object files (using the commands described above) and have a brief discussion. Let's look at the first one.

Setup 1
Figure 1
void helperfunc1()
{}
00D 00000000 SECT4  notype       Static       | .text$mn
    Section length   35, #relocs    3, #linenums    0, checksum  DB372CC, selection    1 (pick no duplicates)
    Relocation CRC E4C10A68
019 00000000 SECT4  notype ()    External     | ?helperfunc1@@YAXXZ (void __cdecl helperfunc1(void))
SECTION HEADER #4
.text$mn name
       0 physical address
       0 virtual address
      35 size of raw data
     45A file pointer to raw data (0000045A to 0000048E)
     48F file pointer to relocation table
       0 file pointer to line numbers
       3 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl helperfunc1(void)" (?helperfunc1@@YAXXZ)
         16 byte align
         Execute Read
From above, we see entry 0x00D and entry 0x019 in the symbol table, which define SECT4 and the external symbol ?helperfunc1@@YAXXZ, the mangled name of the function helperfunc1.

Setup 2
Figure 2
static void helperfunc1()
{}
COFF SYMBOL TABLE
000 010575C9 ABS    notype       Static       | @comp.id
001 80010391 ABS    notype       Static       | @feat.00
002 00000002 ABS    notype       Static       | @vol.md
003 00000000 SECT1  notype       Static       | .drectve
    Section length   41, #relocs    0, #linenums    0, checksum        0
    Relocation CRC 00000000
006 00000000 SECT2  notype       Static       | .debug$S
    Section length  148, #relocs    2, #linenums    0, checksum        0
    Relocation CRC 3A286AB6
009 00000000 SECT3  notype       Static       | .msvcjmc
    Section length    1, #relocs    0, #linenums    0, checksum 77073096
    Relocation CRC 00000000
00C 00000000 SECT3  notype       Static       | __5291C45E_helperfuncs@cpp
00D 00000000 SECT4  notype       Static       | .debug$T
    Section length   5C, #relocs    0, #linenums    0, checksum        0
    Relocation CRC 00000000
010 00000000 SECT5  notype       Static       | .chks64
    Section length   28, #relocs    0, #linenums    0, checksum        0
    Relocation CRC 00000000
Running the dumpbin command with the symbol option gives no helperfunc1 symbol. The symbol is not present in the object file, since it is not used in the translation unit and is declared as static.

Setup 3
Figure 3
static void helperfunc1()
{}

void libfunc1()
{
   helperfunc1();
}
00D 00000000 SECT4  notype       Static       | .text$mn
    Section length   35, #relocs    3, #linenums    0, checksum  DB372CC, selection    1 (pick no duplicates)
    Relocation CRC E4C10A68
013 00000000 SECT6  notype       Static       | .text$mn
    Section length   3A, #relocs    4, #linenums    0, checksum 81ED1D3A, selection    1 (pick no duplicates)
    Relocation CRC B844FE03
01F 00000000 SECT4  notype ()    Static       | ?helperfunc1@@YAXXZ (void __cdecl helperfunc1(void))
020 00000000 SECT6  notype ()    External     | ?libfunc1@@YAXXZ (void __cdecl libfunc1(void))
SECTION HEADER #4
.text$mn name
       0 physical address
       0 virtual address
      35 size of raw data
     4AA file pointer to raw data (000004AA to 000004DE)
     4DF file pointer to relocation table
       0 file pointer to line numbers
       3 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl helperfunc1(void)" (?helperfunc1@@YAXXZ)
         16 byte align
         Execute Read

SECTION HEADER #6
.text$mn name
       0 physical address
       0 virtual address
      3A size of raw data
     65B file pointer to raw data (0000065B to 00000694)
     695 file pointer to relocation table
       0 file pointer to line numbers
       4 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl libfunc1(void)" (?libfunc1@@YAXXZ)
         16 byte align
         Execute Read
Now libfunc1 is calling helperfunc1. From above, we see entry 0x00D and entry 0x013 in the symbol table, which define SECT4 and SECT6 respectively. We can also see the symbol ?helperfunc1@@YAXXZ again; this time it is a static symbol, since we declared helperfunc1 as static in the code. Further, we also see the external symbol ?libfunc1@@YAXXZ. Both sections have selection type 1 (pick no duplicates), which tells the linker that only one definition is allowed.

Setup 4
Figure 4
inline void helperfunc1()
{}
COFF SYMBOL TABLE
000 010575C9 ABS    notype       Static       | @comp.id
001 80010391 ABS    notype       Static       | @feat.00
002 00000002 ABS    notype       Static       | @vol.md
003 00000000 SECT1  notype       Static       | .drectve
    Section length   41, #relocs    0, #linenums    0, checksum        0
    Relocation CRC 00000000
006 00000000 SECT2  notype       Static       | .debug$S
    Section length  148, #relocs    2, #linenums    0, checksum        0
    Relocation CRC 3A286AB6
009 00000000 SECT3  notype       Static       | .msvcjmc
    Section length    1, #relocs    0, #linenums    0, checksum 77073096
    Relocation CRC 00000000
00C 00000000 SECT3  notype       Static       | __5291C45E_helperfuncs@cpp
00D 00000000 SECT4  notype       Static       | .debug$T
    Section length   5C, #relocs    0, #linenums    0, checksum        0
    Relocation CRC 00000000
010 00000000 SECT5  notype       Static       | .chks64
    Section length   28, #relocs    0, #linenums    0, checksum        0
    Relocation CRC 00000000
Running the dumpbin command with the symbol option gives no helperfunc1 symbol. The symbol is not present in the object file, since it is not used in the translation unit and is declared inline. As a matter of fact, this symbol table looks identical to the symbol table from Setup 2.

Setup 5
Figure 5
inline void helperfunc1()
{}

void libfunc1()
{
   helperfunc1();
}
00D 00000000 SECT4  notype       Static       | .text$mn
    Section length   35, #relocs    3, #linenums    0, checksum  DB372CC, selection    2 (pick any)
    Relocation CRC E4C10A68
013 00000000 SECT6  notype       Static       | .text$mn
    Section length   3A, #relocs    4, #linenums    0, checksum 81ED1D3A, selection    1 (pick no duplicates)
    Relocation CRC B844FE03
01F 00000000 SECT4  notype ()    External     | ?helperfunc1@@YAXXZ (void __cdecl helperfunc1(void))
020 00000000 SECT6  notype ()    External     | ?libfunc1@@YAXXZ (void __cdecl libfunc1(void))
SECTION HEADER #4
.text$mn name
       0 physical address
       0 virtual address
      35 size of raw data
     4AA file pointer to raw data (000004AA to 000004DE)
     4DF file pointer to relocation table
       0 file pointer to line numbers
       3 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl helperfunc1(void)" (?helperfunc1@@YAXXZ)
         16 byte align
         Execute Read
SECTION HEADER #6
.text$mn name
       0 physical address
       0 virtual address
      3A size of raw data
     65B file pointer to raw data (0000065B to 00000694)
     695 file pointer to relocation table
       0 file pointer to line numbers
       4 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl libfunc1(void)" (?libfunc1@@YAXXZ)
         16 byte align
         Execute Read 
Now libfunc1 is calling helperfunc1. From above, we see entry 0x00D and entry 0x013 in the symbol table, which define SECT4 and SECT6 respectively. We can also see the external symbols ?helperfunc1@@YAXXZ and ?libfunc1@@YAXXZ. Unlike Setup 3, helperfunc1 is external and the selection type of its section is 2 (pick any), meaning the linker may keep any one of the definitions and discard the rest.

Setup 6
Figure 6
inline void helperfunc1()
{}
#include "helperfuncs.hpp"

void libfunc1()
{
   helperfunc1();
}
00E 00000000 SECT4  notype       Static       | .text$mn
    Section length   35, #relocs    3, #linenums    0, checksum  DB372CC, selection    2 (pick any)
    Relocation CRC 16A75AA6
014 00000000 SECT6  notype       Static       | .text$mn
    Section length   3A, #relocs    4, #linenums    0, checksum 81ED1D3A, selection    1 (pick no duplicates)
    Relocation CRC B844FE03
020 00000000 SECT4  notype ()    External     | ?helperfunc1@@YAXXZ (void __cdecl helperfunc1(void))
021 00000000 SECT6  notype ()    External     | ?libfunc1@@YAXXZ (void __cdecl libfunc1(void))
SECTION HEADER #4
.text$mn name
       0 physical address
       0 virtual address
      35 size of raw data
     53B file pointer to raw data (0000053B to 0000056F)
     570 file pointer to relocation table
       0 file pointer to line numbers
       3 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl helperfunc1(void)" (?helperfunc1@@YAXXZ)
         16 byte align
         Execute Read
SECTION HEADER #6
.text$mn name
       0 physical address
       0 virtual address
      3A size of raw data
     6EC file pointer to raw data (000006EC to 00000725)
     726 file pointer to relocation table
       0 file pointer to line numbers
       4 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl libfunc1(void)" (?libfunc1@@YAXXZ)
         16 byte align
         Execute Read
This setup hardly differs from Setup 5. The only difference is the new helperfuncs.hpp, which is included in libfuncs1.cpp. This is a more typical setup, where the inline function is defined inside a header file. Note that the symbols and their sections have the same lengths, checksums and selection types as in Setup 5.

Setup 7
Figure 7
inline void helperfunc1()
{}
#include "helperfuncs.hpp"

void libfunc1()
{
   helperfunc1();
}
#include "helperfuncs.hpp"

void libfunc2()
{
   helperfunc1();
}
00E 00000000 SECT4  notype       Static       | .text$mn
    Section length   35, #relocs    3, #linenums    0, checksum  DB372CC, selection    2 (pick any)
    Relocation CRC 16A75AA6
014 00000000 SECT6  notype       Static       | .text$mn
    Section length   3A, #relocs    4, #linenums    0, checksum 81ED1D3A, selection    1 (pick no duplicates)
    Relocation CRC 4796C86E
020 00000000 SECT4  notype ()    External     | ?helperfunc1@@YAXXZ (void __cdecl helperfunc1(void))
021 00000000 SECT6  notype ()    External     | ?libfunc2@@YAXXZ (void __cdecl libfunc2(void))
SECTION HEADER #4
.text$mn name
       0 physical address
       0 virtual address
      35 size of raw data
     533 file pointer to raw data (00000533 to 00000567)
     568 file pointer to relocation table
       0 file pointer to line numbers
       3 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl helperfunc1(void)" (?helperfunc1@@YAXXZ)
         16 byte align
         Execute Read
SECTION HEADER #6
.text$mn name
       0 physical address
       0 virtual address
      3A size of raw data
     6E4 file pointer to raw data (000006E4 to 0000071D)
     71E file pointer to relocation table
       0 file pointer to line numbers
       4 number of relocations
       0 number of line numbers
60501020 flags
         Code
         COMDAT; sym= "void __cdecl libfunc2(void)" (?libfunc2@@YAXXZ)
         16 byte align
         Execute Read
This is an even more typical setup: the inline function is defined inside the header file, and the header file is included in two cpp files.

Only the symbol table for the additional file, libfuncs2.obj, is shown above. We can compare the symbols from libfuncs1.obj (from Setup 6) with the symbols from libfuncs2.obj in this setup. Both object files contain the external symbol ?helperfunc1@@YAXXZ, whose section has selection type 2. Thanks to this selection type, the linker will not complain about multiple definitions at link time.
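
To tie the setups together, here is a toy model of the decision described in this post. It is not how the Microsoft linker is implemented, only a sketch of the rule as I understand it from the documentation, and it covers just selection types 1 and 2. Static symbols are left out, since they are not visible across object files. The names ComdatDef and resolve are hypothetical, and keeping the first definition seen is my own choice; with "pick any" the linker is free to keep whichever it likes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

constexpr std::uint8_t kNoDuplicates = 1;  // pick no duplicates
constexpr std::uint8_t kAny = 2;           // pick any

// One external COMDAT definition found in an object file.
struct ComdatDef {
    std::string symbol;      // e.g. "?helperfunc1@@YAXXZ"
    std::string object;      // e.g. "libfuncs1.obj"
    std::uint8_t selection;  // selection type of the symbol's section
};

// Returns the definition kept for each symbol. Throws if a symbol is
// defined more than once and not every definition is "pick any".
inline std::map<std::string, ComdatDef> resolve(const std::vector<ComdatDef>& defs) {
    std::map<std::string, ComdatDef> kept;
    for (const ComdatDef& d : defs) {
        // This toy model only knows selection types 1 and 2.
        assert(d.selection == kNoDuplicates || d.selection == kAny);
        auto it = kept.find(d.symbol);
        if (it == kept.end()) {
            kept.emplace(d.symbol, d);
        } else if (it->second.selection != kAny || d.selection != kAny) {
            throw std::runtime_error("multiple definition of " + d.symbol);
        }
        // Otherwise both are "pick any": keep the first, drop this one.
    }
    return kept;
}
```

Fed with the four external symbols from Setup 7, resolve keeps three definitions and silently drops the second helperfunc1. Two definitions of libfunc1 with selection type 1 would instead end in the exception, which plays the role of the multiple definition error.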

You are welcome to leave comments, complaints or questions!
