How does structure padding work with respect to the largest member in C?

I am confused about the concept of structure padding. I understand that structure padding improves processor performance at the cost of memory. Here are some structures I have defined:

case 1:

typedef struct{
  double A; //8-byte
  char B;   //1-byte
  char C;   //1-byte
} Test1;

Here the total size of the structure will be 16 bytes. (The largest member is the double, hence the compiler aligns memory in units of 8 bytes.)

But consider these two cases:

case 2:

typedef struct{
  int A;     //4-byte
  double B;  //8-byte
  float C;   //4-byte
} Test2;

Here the largest member is the double (8 bytes), so this will allocate 8 + 8 + 8 = 24 bytes.

case 3:

typedef struct{
  double A;   //8-byte
  Int B;      //4-byte
  float C;    //4-byte
} Test3;

Here the largest member is also the double (8 bytes), so I expected this to be 24 bytes as well. But when I print the size, I get 16 bytes. Why does it not behave like case 2? Any explanations?

Solution

How does structure padding work with respect to the largest member in C?

Padding is fundamentally determined by the alignment requirements of the members, not solely by their sizes. Each complete object type has an alignment requirement, which is some number A such that the address of the object must always be a multiple of A. Alignment requirements are always powers of two.

An object’s size is always a multiple of its alignment requirement, but the alignment requirement is not always equal to the size. For example, an eight-byte double might have four-byte alignment in some C implementations. Alignment requirements typically arise out of hardware considerations. A system might process eight-byte objects in four-byte chunks whenever it loads them from memory or stores them to memory, so that hardware would not care about eight-byte alignment even for eight-byte objects. A C implementation designed for that system could make the alignment requirement for an eight-byte double be just four bytes.

For your examples, we will use alignment requirements of one byte for char, four bytes for a four-byte int, four bytes for a four-byte float, and eight bytes for an eight-byte double.

In case 1:

typedef struct{
  double A; //8-byte
  char B;   //1-byte
  char C;   //1-byte [Typo fixed; was "C:".]
} Test1;

The structure will always start at the required alignment boundary, because the compiler will give the structure itself an alignment requirement equal to the strictest alignment requirement of any of its members. (Greater than is also allowed by the C standard, but this is not typical in practice.) Then the double A occupies eight bytes. At that point, the char B is at an allowed place, because its alignment requirement is only one byte, so any address is allowed. And char C is also okay. So far, the structure is 10 bytes long. Finally, the structure needs to have an eight-byte alignment so that it can always satisfy the alignment requirement of the double, so the structure’s total size has to be a multiple of eight bytes. (The structure’s total size must be a multiple of its alignment requirement so that in an array of them each element starts at the required alignment.) To accomplish this, we insert six bytes of padding at the end, and the total structure size is 16 bytes.

In case 2:

typedef struct{
  int A;     //4-byte
  double B;  //8-byte
  float C;   //4-byte
} Test2;

int A starts at offset zero and occupies four bytes. Then double B needs to start at a multiple of eight bytes, so four bytes of padding are inserted and B goes at offset eight. Now we are up to 16 bytes: four for int A, four for padding, and eight for double B. Then float C is at an okay position, because 16 is a multiple of four. It adds four bytes, and we are up to 20 bytes. The structure size needs to be a multiple of eight bytes, so we add four bytes of padding, making 24 bytes total.

In case 3:

typedef struct{
  double A;   //8-byte
  int B;      //4-byte [Typo fixed; was "Int".]
  float C;    //4-byte
} Test3;

double A is eight bytes, and then int B adds four bytes. Now we are at 12 bytes. That is okay for float C, because its alignment requirement is four bytes, and 12 is a multiple of four. This float adds four bytes to the structure, so the size is now 16 bytes. That is okay for the structure’s alignment requirement, eight bytes, because 16 is a multiple of eight. So we do not need to add any padding, and the total structure size is 16 bytes.

Here is the method that compilers commonly use to determine padding in structures:

Each member in the structure has some size s and some alignment requirement a.
The compiler starts with a size S set to zero and an alignment requirement A set to one (byte).
The compiler processes each member in the structure in order:

Consider the member’s alignment requirement a. If S is not currently a multiple of a, then add just enough bytes to S so that it is a multiple of a. This determines where the member will go; it will go at offset S from the beginning of the structure (for the current value of S). The bytes skipped by this addition are called padding bytes.
Set A to the least common multiple¹ of A and a.
Add s to S, to set aside space for the member.

When the above process is done for each member, consider the structure’s alignment requirement A. If S is not currently a multiple of A, then add just enough to S so that it is a multiple of A.

The size of the structure is the value of S when the above is done.

Additionally:

If any member is an array, its size is the number of elements multiplied by the size of each element, and its alignment requirement is the alignment requirement of an element.
If any member is a structure, its size and alignment requirement are calculated as above.
If any member is a union, its alignment requirement is the least common multiple¹ of the alignment requirements of all its members, and its size is the size of its largest member plus just enough to make it a multiple of the union’s alignment requirement.

For elementary types (int, double, et cetera), the alignment requirements are implementation-defined and are usually largely determined by the hardware. On many processors, it is faster to load and store data when it has a certain alignment (usually when its address in memory is a multiple of its size). Beyond this, the rules above follow largely from logic; they put each member where it must be to satisfy alignment requirements without using more space than necessary.

Footnote

¹ I have worded this for a general case as using the least common multiple of alignment requirements. However, since alignment requirements are always powers of two, the least common multiple of any set of alignment requirements is the largest of them.