Sep 14, 2020

Encoding CRC-32

CRC, cyclic redundant check, is a method for error detection. In this post, I'm focusing on different implementations of the CRC calulation and more specific the CRC-32 calculation and its variations. Finally, I will investigate a zip file and its CRC calculation. I'm using Visual Studio 2019 Community, HxD editor and WinRar.

Before going into the implementation, I will provide a short overview of the CRC. I have no intention to explain the theory in detail, I leave that to other resources on the web. I'm focusing on CRC-32 since it's used in the zip file.

CRC is a type of cyclic block codes. Cyclic block codes are a type of linear block codes. The CRC itself is a number, typically 8- 16- or 32- bits in size. It is calculated from the message data and a polynom. Typically the CRC bits together with the message data forms a code word, which is modulated and sent over a channel.

Regarding the calculation, it can be described from a mathematical point of view using long division. It can also be described from a LFSR (Linear-feedback shift register) point of view and that is what is described here.

The LFSR can be described as a fixed number of cascaded flip-flops. Each clock tick will shift around the content (0 or 1) of the flip-flops according to the polynom. The polynom specify how many flip-flops, how many XOR gates and where they are in the circuit. We will have a look at the CRC-32 circuit in a moment.

A commonly used CRC-32 polynom, used in Ethernet, PKZIP and other applications, is defined as: 

G(x) = x³² + x²⁶ + x²³ + x²² + x¹⁶ + x¹² + x¹¹ + x¹⁰ + x⁸ + x⁷ + x⁵ + x⁴ + x² + x + 1

The polynom can be written in two forms, forward and reversed. In addition, the MSB can be omitted, since it is always 1.

Forward form:    x²⁶ + x²³ + x²² + x¹⁶ + x¹² + x¹¹ + x¹⁰ + x⁸ + x⁷ + x⁵ + x⁴ + x² + x + 1
Reversed form: 1 + x + x² + x⁴ + x⁵ + x⁷ +  x⁸ + x¹⁰ + x¹¹ + x¹² + x¹⁶ + x²² +  x²³ + x²⁶

For implementation purpose, we can translate the polynom into a binary/hexadecimal number.

Forward form:    00000100110000010001110110110111 (0x04C11DB7)
Reversed form:   11101101101110001000001100100000 (0xEDB88320)

The CRC-32 circuit can be represented as:

CRC-32 circuit

In the circuit above, there are 32 flip-flops and 14 XOR gates. For each clock tick, each flip-flop shift out its content. This means the XOR gate in the top right corner will produce a new LSB value based on the MSB and the incoming message bit.

Let's start the discussion about the CRC-32 implementation by showing an example that simulates the CRC-32 circuit.
uint32_t CalcCrc32_LeftShift(const std::vector<uint8_t>& bytes)
{
  uint32_t crcRegister = 0xFFFFFFFF;

  for (auto byte : bytes) {
    for (int j = 0; j < 8; j++) {
      uint32_t result = ((byte << 24) ^ crcRegister) & 0x80000000;
      crcRegister <<= 1;
      if (result) crcRegister ^= 0x04C11DB7;
      byte <<= 1;
    }
  }

  return ~crcRegister;
}
What's going on here?
Here is the short story. The function takes a vector of bytes (message data) and produce a CRC by doing a MSB shift (left shift) for each bit in the message data and applying the normal CRC-32 polynom when needed.

Here is the long story. The variable uint32_t crcRegister is used to represent the 32 bit LFSR. The crcRegister simulates the cascaded flip-flops, where the MSB of the crcRegister corresponds to flip-flop 31 in the CRC-32 circuit. For each byte, we loop 8 times, which simulates 8 clock ticks, i.e. 8 shifts. We have to shift 8 times since one byte is 8 bits. For each clock tick, the content of crcRegister will be updated. From the CRC-32 circuit, we can observe that we only need to apply the CRC-32 polynom if the output from the XOR gate in the top right corner is 1. I will explain in moment why. By "apply" the forward CRC-32 polynom, I mean perform each XOR operation in the CRC-32 circuit. The output from the XOR gate in the top right corner, is actually represented by the MSB in the variable uint32_t result. So let's say that the MSB in result is 1, then we know for sure that one of the operands in each XOR operations will have the value 1. The other operand in each XOR operation is the value shifted out from the flip-flop, which can be either 0 or 1. So the result of performing each XOR operation may either be 0 or 1. On the other hand if the MSB in result is 0, then we know for sure that one of the operands in each XOR operation will have the value 0. The other operand in each XOR operations is the value shifted out from the flip-flop, which can either be 0 or 1. So the result of performing each XOR operation can always be considered as the value shifted out from the flip-flop, meaning no XOR operation is neeed.

Below is a program, which is calcules the CRC-32 code (CRC-32 BZIP2) from a message data.
int main() {
  std::vector<uint8_t> messageData = {0xDE, 0xAD, 0xBE, 0xEF};
  auto crc32_BZIP2 = CalcCrc32_LeftShift(messageData);
  std::cout << std::hex << crc32_BZIP2 << std::endl;
  return 0;
}
CRC-32 BZIP2 = 0x7e25e5e7 for message data = 0xDEADBEEF

There are details in the function CalcCrc32_LeftShift that I've not discussed yet. These details are related to the name CRC-32 BZIP2, which I will desribe in a moment. Except from the polynom, there are additional parameters that can be variated to obtain slightly differerent CRC-32 codes. I will discuss them below.

The initialization value of the crcRegister is commonly set to 0x00000000 or 0xFFFFFFFF. From an LFSR point of view, the flip-flops are set to 0 or 1.

We may choose to perform a bitwise inversion of the final result. Note that this can be achieved by applying an XOR operation with 0xFFFFFFFF of the final result as well.

There are two more possible parameters. For each byte, we may (or not) reflect the input, i.e the bits in the message data. Further, we may (or not) reflect the output, i.e. the bits in final result after all iterations.

Take a look at the previous program again, which calls the function CalcCrc32_LeftShift. First we have the polynom, which is set to 0x04C11DB7. There is no reflection of the input. The initialization value is set to 0xFFFFFFFF. The MSB in the crcRegister is shifted out and processed together with the MSB in the message bytes. The output is inverted. There is no reflection of the output. All these parameters together gives the specific CRC-32 code called CRC-32 BZIP2 code. 

For further variations of the parameters and its specific CRC-32 code, I recommend the Online CRC calculator.

Let's have a look at another program where the input and output are reflected. The name of this specific CRC-32 code, is simply CRC-32. The bits in each byte must be reflected as well as the final result. In other words, the MSB in the crcRegister is shifted out and processed together with the LSB in the message bytes. Below is an implementation of reflecting data.
template <class T>
T Reflect(T in)
{
  T x = in;
  T out = 0;
  auto n = sizeof(T) * 8;
  for (auto i = 0; i < n; i++) {
    out = out << 1 | (x & 1);
    x >>= 1;
  }

  return out;
}

std::vector<uint8_t> Reflect(const std::vector<uint8_t>& data)
{
  std::vector<uint8_t> result(data.size());
  std::transform(data.begin(), data.end(), result.begin(), 
                 [](auto byte) { return Reflect(byte); });

  return result;
}

int main() {
  std::vector<uint8_t> messageData = {0xDE, 0xAD, 0xBE, 0xEF};
  auto crc32 = Reflect(CalcCrc32_LeftShift(Reflect(messageData)));
  std::cout <<  std::hex << crc32 << std::endl;
  return 0;
}
CRC-32 = 0x7c9ca35a for message data = 0xDEADBEEF

There is a more efficient way to calculate the CRC-32 above by using a right shift implementation instead. In this implemenation the reversed polynom come in handy.
uint32_t CalcCrc32_RightShift(const std::vector<uint8_t>& bytes)
{
  uint32_t crcRegister = 0xFFFFFFFF;

  for (auto byte : bytes) {
    for (int j = 0; j < 8; j++) {
      uint32_t result = (byte ^ crcRegister) & 1;
      crcRegister >>= 1;
      if (result) crcRegister ^= 0xEDB88320;
      byte >>= 1;
    }
  }

  return ~crcRegister;
}

int main() {
  std::vector<uint8_t> messageData = {0xDE, 0xAD, 0xBE, 0xEF};
  auto crc32 = CalcCrc32_RightShift(messageData);
  std::cout <<  std::hex << crc32 << std::endl;
  return 0;
}

CRC-32 = 0x7c9ca35a for message data = 0xDEADBEEF

We can understand the simularity between the reversed input/output left shift version and right shift version. In both cases, we process the MSB in the crcRegister together with LSB in the message bytes.

From above, we can conclude:
CRC-32 BZIP2 = CalcCrc32_LeftShift(messageData) = Reverse(CalcCrc32_RightShift(Reverse(messageData)) 

CRC-32 = CalcCrc32_RightShift(messageData) = Reverse(CalcCrc32_LeftShift(Reverse(messageData))

Before ending this post, I will have a quick look at a compressed file and its CRC code.

I'm using HxD to create a simple file, containing 0xDEADBEEF.

HxD - Uncompressed file

I compress it, using WinRar,  and look into the compressed file using HxD.

HxD - Compressed file

According to Wikipedia - Zip,  we can see that the CRC code is at offset 14 in the file header. This code is what we've recently calculated, i.e. the CRC-32 code.

You are welcome to leave comments, complaints or questions!

Jul 2, 2020

Representation of big- and little- endians in memory

In previous posts, I've briefly discussed endianess. In this post I will look further into this topic and provide some C++ code. I'm using Windows 10 64 bit with Microsoft Visual Community C++2019.

When I'm building programs in my IDE, I'm building in x86 debug mode. Yes, x86 means little endianess. Little endianess means LSB byte of the data is stored at lower address.

Let's look into an C++ example, how to display the addresses and values.
uint32_t data = 0xDEADBEEF;
This is the value we are going investigate how it's represented in memory. The LSB byte is EF and since we are dealing with x86, it will be stored at lower address.

First we need a function to output the content of each memory address.
void DisplayMemory(uint32_t data)
{
  auto address = reinterpret_cast<uint8_t*>(&data);
  auto n = sizeof(data);

  for (std::size_t i = 0; i < n; i++) {
    std::cout << "[" 
              << static_cast<void*>(address + i) 
              << "]: " 
              << std::setw(2) 
              << std::setfill('0')
              << std::hex
              << static_cast<uint32_t>(*(address + i))
              << std::endl;
  }
}
Let's discuss the snippet above in more detail.
auto address = reinterpret_cast<uint8_t*>(&data);
Above, we cast the address of data from uint32_t pointer to uint8_t pointer and store it in an address pointer. We have to do this cast, so we can step the address pointer byte by byte. Since those types are unrelated, we have to use the reinterpret_cast operator (and not static_cast) for this purpose. We could use (uint8_t*) for conversion as well. Let's continue with the display part of the code.
std::cout << "[" 
          << static_cast<void*>(address + i) 
          << "]: " 
          << std::setw(2) 
          << std::setfill('0')
          << std::hex
          << static_cast<uint32_t>(*(address + i))
          << std::endl;
Above, we cast address + i, i.e. the uint8_t pointer to void pointer. Why? If we pass uint8_t pointer to the operator<<, it will handle the pointer as a string (char*). Since we want to display the address as a hexadecimal address, we must use the static_cast operator and cast to void pointer for this purpose. It is sufficient with the static_cast operator, since all types of pointers can be converted to void pointers (reinterpret_cast not needed).

There is one more static_cast above. It is needed to display the content of the address pointer. Like the previously cast, we need static_cast since we don't want to display a character, but a number. As a matter of fact, integer promotion could have solved this as well.
+(*(address + i))
The for-loop call std::cout for each address pointer and display the associated byte value.

Note that we could have used printf as well in this example.
printf("[%x]: %.2x\n", address + i, *(address + i));
Let's find out how the output from this code below, look like.
DisplayMemory(data);
 
0xDEADBEEF - Little endianess

Above, we see the little endianess representation of 0xDEADBEEF. LSB byte ef is stored at lower address.

Now, let's continue with big endianess. Big endianess means MSB byte of the data is stored at lower address. In our example, the MSB byte is DE, it will be stored at lower address, which we will se below.

First we need a function to convert from little endianess to big endianess.
uint32_t SwapEndians(uint32_t data)
{
   return (data >> 24) | 
           ((data & 0x00FF0000) >> 8) |
           ((data & 0x0000FF00) << 8) | 
           (data << 24);
}
The implementation above uses shift and binary manipulations. Another implementation would be to copy the bytes in wanted order.
uint32_t SwapEndians(uint32_t data)
{
  uint32_t result;
  uint8_t *resultPtr = reinterpret_cast(&result);
  uint8_t *dataPtr = reinterpret_cast(&data);
  resultPtr[0] = dataPtr[3];
  resultPtr[1] = dataPtr[2];
  resultPtr[2] = dataPtr[1];
  resultPtr[3] = dataPtr[0];
  
  return result;
}
Now, let's use this function to display the big endian representation of 0xDEADBEEF.
uint32_t data = 0xDEADBEEF;
auto b = SwapEndians(data);
DisplayMemory(b);

0xDEADBEEF - Big endianess

Above, we see the big endianess representation of 0xDEADBEEF. MSB byte be is stored at lower address.

You are welcome to leave comments, complaints or questions!