LD: ALIGN vs SUBALIGN in linker scripts

How do they differ?

I read that SUBALIGN() somehow forces a certain alignment. Are there other differences?

When should I use ALIGN() and when should I use SUBALIGN()?

Solution

SUBALIGN is specifically for adjusting the alignment of the input sections within an output section. To illustrate:

$ cat one.c
char a_one __attribute__((section(".mysection"))) = 0;
char b_one __attribute__((section(".mysection"))) = 0;

$ cat two.c
char a_two __attribute__((section(".mysection"))) = 0;
char b_two __attribute__((section(".mysection"))) = 0;

$ gcc -c one.c two.c

Case 1

$ cat foo_1.lds
SECTIONS
{
    . = 0x10004;
    .mysection ALIGN(8) : {
        *(.mysection)
    }
}

$ ld -T foo_1.lds one.o two.o -o foo1.out
$ readelf -s foo1.out

Symbol table '.symtab' contains 9 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000010008     0 SECTION LOCAL  DEFAULT    1 
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS one.c
     4: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS two.c
     5: 000000000001000b     1 OBJECT  GLOBAL DEFAULT    1 b_two
     6: 0000000000010008     1 OBJECT  GLOBAL DEFAULT    1 a_one
     7: 0000000000010009     1 OBJECT  GLOBAL DEFAULT    1 b_one
     8: 000000000001000a     1 OBJECT  GLOBAL DEFAULT    1 a_two

$ readelf -t foo1.out | grep -A3 mysection
  [ 1] .mysection
       PROGBITS               PROGBITS         0000000000010008  0000000000010008  0
       0000000000000004 0000000000000000  0                 1
       [0000000000000003]: WRITE, ALLOC

Here, ALIGN(8) aligns .mysection to the next 8-byte boundary, 0x10008, after 0x10004.
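The rounding that ALIGN(8) performs here can be sketched in C. This only illustrates the arithmetic and is not ld's actual implementation; align_up is a hypothetical helper name, and it assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of align.
 * Assumes align is a nonzero power of two. An address that is already
 * aligned is returned unchanged. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

Calling align_up(0x10004, 8) gives 0x10008, which is the address readelf reports for .mysection.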

The char symbol a_one, from input section one.o(.mysection), sits at the start of .mysection, followed at the next byte by b_one, from the same input section. At the next byte is a_two, from input section two.o(.mysection), then b_two, also from two.o(.mysection). All 4 objects from all input sections *(.mysection) are simply placed end to end from the start of output section .mysection.

Case 2

$ cat foo_2.lds
SECTIONS
{
    . = 0x10004;
    .mysection ALIGN(8) : SUBALIGN(16) {
        *(.mysection)
    }
}

$ ld -T foo_2.lds one.o two.o -o foo2.out
$ readelf -s foo2.out

Symbol table '.symtab' contains 9 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000010008     0 SECTION LOCAL  DEFAULT    1 
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS one.c
     4: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS two.c
     5: 0000000000010021     1 OBJECT  GLOBAL DEFAULT    1 b_two
     6: 0000000000010010     1 OBJECT  GLOBAL DEFAULT    1 a_one
     7: 0000000000010011     1 OBJECT  GLOBAL DEFAULT    1 b_one
     8: 0000000000010020     1 OBJECT  GLOBAL DEFAULT    1 a_two

$ readelf -t foo2.out | grep -A3 mysection
  [ 1] .mysection
       PROGBITS               PROGBITS         0000000000010008  0000000000010008  0
       000000000000001a 0000000000000000  0                 16
       [0000000000000003]: WRITE, ALLOC

This time, the 8-byte aligned address of .mysection is unchanged. But the effect of SUBALIGN(16) is that symbol a_one, from input section one.o(.mysection), is placed at the next 16-byte boundary, 0x10010, after the start of .mysection, and symbol b_one, from the same input section, is at the next byte. Symbol a_two, from input section two.o(.mysection), is at the following 16-byte boundary, 0x10020, and b_two, also from two.o(.mysection), is 1 byte after that. So SUBALIGN(16) aligns the start of each input section, not each symbol, and the padding it inserts grows the section size reported by readelf from 0x4 to 0x1a.
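The two layouts can be reproduced with a small model of how input sections are placed. This is a rough sketch, not how ld is implemented: place_inputs and align_up are hypothetical names, each input .mysection is assumed to be 2 bytes long (two chars) with a natural alignment of 1 (the alignment readelf shows in Case 1), and a subalign of 0 is used here to mean "no SUBALIGN given".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Simplified model: place n input sections one after another, starting
 * at out_start (the already ALIGN()ed output section address).
 * If subalign is nonzero it overrides each input section's own
 * alignment; otherwise natural_align[i] is used.
 * starts[i] receives the address of input section i.
 * Returns the end address of the output section. */
static uint64_t place_inputs(uint64_t out_start,
                             const uint64_t *sizes,
                             const uint64_t *natural_align,
                             size_t n,
                             uint64_t subalign,
                             uint64_t *starts)
{
    uint64_t dot = out_start; /* plays the role of the location counter */
    for (size_t i = 0; i < n; i++) {
        uint64_t a = subalign ? subalign : natural_align[i];
        dot = align_up(dot, a); /* pad up to the required boundary */
        starts[i] = dot;
        dot += sizes[i];
    }
    return dot;
}
```

With out_start = 0x10008, sizes {2, 2} and natural alignments {1, 1}, the model places the input sections at 0x10008 and 0x1000a when subalign is 0 (Case 1, section size 0x4), and at 0x10010 and 0x10020 when subalign is 16 (Case 2, section size 0x1a), matching the readelf output above.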