Feb 24, 2018

Representation of data types in memory - Part 3

You have probably written statements like this thousands of times, but do you know what's going on under the hood?
float foo = 1.21;
This is the continuation of my series about the representation of data types in memory. In my previous post, I discussed the numerical integral types. Now it's time to dig into the wonderful world of floating types. There is a lot to discuss about floating types, so in this post I will focus on the Normalized form and its representation in memory.

Every C++ programmer has dealt with the data types float and double and we will soon see how they are represented in memory. But before doing that, we need to understand some basic concepts and formats, so let's discuss some theory before proceeding to the practical part.

My intention is not to give a complete description of the theory of floating point numbers. I will give a very brief summary here (and I mean it!); the details can easily be found on the net.

Below is a very good description (from this site) of how a floating point number is expressed when stored in memory.
A floating-point number is typically expressed in the scientific notation, with a fraction (F), and an exponent (E) of a certain radix (r), in the form of F×r^E. Decimal numbers use radix of 10 (F×10^E); while binary numbers use radix of 2 (F×2^E).
Example:
Let's say we have 16.25₁₀. By using the scientific notation, it can be written as 16.25₁₀ * 10^0 or 1.625₁₀ * 10^1 and so forth. In binary, it can be written as 10000.01₂ * 2^0, 1000.001₂ * 2^1, 100.0001₂ * 2^2, 10.00001₂ * 2^3 or 1.000001₂ * 2^4 and so forth. The representation used in a floating point number is 1.000001₂ * 2^4, which is called the Normalized form.

The IEEE 754 standard describes the single precision format and the double precision format. It is important to have a basic understanding of these formats, because the floating types in C++ are based on them in practice.

MSDN states the following (Visual Studio 2010 specific) for the data type float:
The float type is stored as a four-byte, single-precision, floating-point number. It represents a single-precision 32-bit IEEE 754 value.
MSDN states the following (Visual Studio 2010 specific) for the data type double:
The double type is stored as an eight-byte, double-precision, floating-point number. It represents a double-precision 64-bit IEEE 754 value.
Voila! There we have the basic definition of the data types float and double.

The single precision format consists of 32 bits (4 bytes), where the Most Significant Bit (MSB) represents the sign (S), the following 8 bits represent the exponent (E), and the 23 Least Significant Bits (LSBs) represent the fraction (F). Note that a bias is applied to the exponent (E) so that both positive and negative exponents can be represented. The bias is 127 in single precision format, meaning E = 0 is stored as 127, E = 1 is stored as 128, and so on.

The double precision format consists of 64 bits (8 bytes), where the MSB represents the sign (S), the following 11 bits represent the exponent (E), and the 52 LSBs represent the fraction (F). Again, a bias is applied to the exponent (E) so that both positive and negative exponents can be represented. The bias is 1023 in double precision format, meaning E = 0 is stored as 1023, E = 1 is stored as 1024, and so on.

Before proceeding to the practical part, just a few words about Normalized form.

As we have seen above, Normalized form means we have an implicit leading 1 to the left of the radix point, followed by the fraction (F), for instance 1.000001₂. This leading 1 is not stored in the 32/64 bit format, but we know it is there whenever Normalized form is used.

Let's look at a simple application. It is very similar to the simple application from my previous post, except that the data type is float.
int main()
{
   float a = 1.0;
   float b = 2.0;
   float c = 3.0;
   float d = 4.0;

   return 0;
}
After starting WinDbg and executing all the initialization statements, we see the result below.
WinDbg - Memory view after initializations
As discussed in previous posts, the blue rectangle in the Memory view indicates the inserted block of 0xCC, which is typically added in Debug mode. The blue arrow shows the offset of the Memory view, which is where the insertion of 0xCC starts. Each float is also "guarded" by four bytes of 0xCC. In the Disassembly view, we can see that special floating point instructions are used. I will not discuss them in detail here; if you want more information, they are described in the Intel x86 reference manual, more specifically in "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A: Instruction Set Reference, A-M".

Binary representation of 1.0

To understand how a float is stored in the memory, I will work through a couple of examples. Let's start with the first statement in the simple application;
float a = 1.0;
In binary we have 1.0₁₀ = 1.0₂, which is equal to 1.0₂ * 2^0 in Normalized form. Now let's see how this number 1.0₂ * 2^0 is stored in memory bit by bit.
Positive number -> Sign (S) bit = 0₂
Exponent: 0, i.e. the exponent (E) is represented as 127₁₀ = 01111111₂
Fraction (F): 0₂ = 00000000000000000000000₂
Binary representation: 00111111 10000000 00000000 00000000₂
Hexadecimal representation: 3F800000₁₆

This number is stored in the memory by this instruction;
fstp    dword ptr[ebp-8h]
The fstp instruction (Store Floating Point Value and Pop) copies the value from the top of the FPU register stack to memory in either single or double precision format (single in this case, because the type is float) and then pops the register stack. Note that the fld1 instruction (Load Constant, here +1.0) pushed this value (1.0) onto the FPU register stack in the first place. 0x3F800000 will be stored at memory address ebp-0x8. Taking little-endian into account, the bytes will be saved as follows:

ebp-0x8: 0x00
ebp-0x7: 0x00
ebp-0x6: 0x80
ebp-0x5: 0x3F

Binary representation of 2.0
float b = 2.0;
Since 1.0 was already converted above, I will give a briefer explanation below.
Binary form: 10.0₂
Normalized form: 1.0₂ * 2^1
Positive number -> Sign (S) bit = 0₂
Exponent: 1, i.e. the exponent (E) is represented as 128₁₀ = 10000000₂
Fraction (F): 0₂ = 00000000000000000000000₂
Binary representation: 01000000 00000000 00000000 00000000₂
Hexadecimal representation: 40000000₁₆

This number is stored in the memory by this instruction;
fstp    dword ptr[ebp-14h]
Taking little-endian into account, each byte will be saved according to below;

ebp-0x14: 0x00
ebp-0x13: 0x00
ebp-0x12: 0x00
ebp-0x11: 0x40

Binary representation of 3.0
float c = 3.0;
Binary form: 11.0₂
Normalized form: 1.1₂ * 2^1
Positive number -> Sign (S) bit = 0₂
Exponent: 1, i.e. the exponent (E) is represented as 128₁₀ = 10000000₂
Fraction (F): 1₂, padded on the right to 23 bits = 10000000000000000000000₂
Binary representation: 01000000 01000000 00000000 00000000₂
Hexadecimal representation: 40400000₁₆

This number is stored in the memory by this instruction;
fstp    dword ptr [ebp-20h]
Taking little-endian into account, each byte will be saved according to below;

ebp-0x20: 0x00
ebp-0x1F: 0x00
ebp-0x1E: 0x40
ebp-0x1D: 0x40

Binary representation of 4.0
float d = 4.0;
Binary form: 100.0₂
Normalized form: 1.0₂ * 2^2
Positive number -> Sign (S) bit = 0₂
Exponent: 2, i.e. the exponent (E) is represented as 129₁₀ = 10000001₂
Fraction (F): 0₂ = 00000000000000000000000₂
Binary representation: 01000000 10000000 00000000 00000000₂
Hexadecimal representation: 40800000₁₆

This number is stored in the memory by this instruction;
fstp    dword ptr[ebp-2Ch]
Taking little-endian into account, each byte will be saved according to below;

ebp-0x2C: 0x00
ebp-0x2B: 0x00
ebp-0x2A: 0x80
ebp-0x29: 0x40

The four examples above dealt only with numbers in Normalized form. Later, I will have a look at the Denormalized form.

You are welcome to leave comments, complaints or questions!