Alignment of stack variables in Assembly Languages


Are there any assembly directives to align specific stack data variables?

For example, suppose a MASM function has these local variables with initial values

LOCAL           beginStack:QWORD         ; ffffffffdeadbeef
LOCAL           myLocalA:QWORD           ; ffffffffffffffff
LOCAL           myLocalB:QWORD           ; 0000000000000000
LOCAL           myArrayA[10]:BYTE        ; AAAAAAAAAA
LOCAL           myArrayB[10]:BYTE        ; BBBBBBBBBB
LOCAL           endStack:QWORD           ; ffffffffbaadf00d

The stack then has this layout. Note that endStack is misaligned: it starts at ...b64, 4 bytes past an 8-byte boundary

00000048`51effb60 baadf00d000906ec  ; baadf00d
00000048`51effb68 42424242ffffffff  ; ffffffff
00000048`51effb70 4141424242424242 
00000048`51effb78 4141414141414141 
00000048`51effb80 0000000000000000 
00000048`51effb88 ffffffffffffffff 
00000048`51effb90 ffffffffdeadbeef 

To align endStack, I tried inserting a 4-byte alignment pad, pad[4], among the local variables

LOCAL           beginStack:QWORD
LOCAL           myLocalA:QWORD
LOCAL           myLocalB:QWORD
LOCAL           myArrayA[10]:BYTE
LOCAL           myArrayB[10]:BYTE
LOCAL           pad[4]:BYTE
LOCAL           endStack:QWORD

which does correctly align endStack

0000005b`950ff950 ffffffffbaadf00d  ; aligned
0000005b`950ff958 42424242ffdaf38f  ; pad[4] is ffdaf38f
0000005b`950ff960 4141424242424242 
0000005b`950ff968 4141414141414141 
0000005b`950ff970 0000000000000000 
0000005b`950ff978 ffffffffffffffff 
0000005b`950ff980 ffffffffdeadbeef 

Another approach (where applicable) is to reorder the stack variables by descending size:
QWORD -> DWORD -> WORD -> BYTE
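
To see why the descending order helps, here is a rough C model of the layout. It is only a sketch: local_var, count_misaligned and sort_descending are hypothetical names, and the model assumes an 8-byte aligned frame base with locals packed downward back to back, which matches the first dump above but is not a statement of how MASM allocates locals in general.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of one local variable. */
typedef struct {
    size_t elem_size;  /* 8 = QWORD, 4 = DWORD, 2 = WORD, 1 = BYTE */
    size_t count;      /* 1 for a scalar, N for an array */
} local_var;

/* Count locals whose address would not be a multiple of elem_size.
   Assumes an 8-byte aligned frame base and no padding between locals. */
size_t count_misaligned(const local_var *vars, size_t n)
{
    size_t offset = 0, bad = 0;
    for (size_t i = 0; i < n; i++) {
        assert(vars[i].elem_size != 0);
        /* distance of this local below the frame base */
        offset += vars[i].elem_size * vars[i].count;
        if (offset % vars[i].elem_size != 0)
            bad++;
    }
    return bad;
}

/* qsort comparator: larger element sizes first. */
static int by_elem_size_desc(const void *a, const void *b)
{
    const local_var *x = a, *y = b;
    return (x->elem_size < y->elem_size) - (x->elem_size > y->elem_size);
}

/* Reorder the locals QWORD -> DWORD -> WORD -> BYTE. */
void sort_descending(local_var *vars, size_t n)
{
    qsort(vars, n, sizeof vars[0], by_elem_size_desc);
}
```

Fed the six locals from the example, the model reports one misaligned variable (endStack) in the declared order and none once the two byte arrays are moved to the end.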

Question

GCC has __attribute__ ((aligned (8))) to align variables. Is there an equivalent for assembly languages?
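
For reference, this is roughly what the GCC side looks like. It is a minimal sketch for GCC or Clang; is_aligned and end_stack_is_aligned are illustrative names, and the locals simply reuse the names from the MASM example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helper: nonzero when p is a multiple of alignment. */
static int is_aligned(const void *p, size_t alignment)
{
    /* alignment must be a nonzero power of two */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Returns 1 when the attributed local sits on an 8-byte boundary. */
int end_stack_is_aligned(void)
{
    volatile char myArrayA[10];
    volatile char myArrayB[10];
    /* GCC/Clang extension: request 8-byte alignment for this local */
    volatile uint64_t endStack __attribute__((aligned(8)));

    myArrayA[0] = 'A';
    myArrayB[0] = 'B';
    endStack = 0xffffffffbaadf00dULL;
    return is_aligned((const void *)&endStack, 8);
}
```

The compiler inserts whatever padding is needed after the two 10-byte arrays, which is the part I am doing by hand in MASM.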

It does feel like higher-level languages such as C/C++ have a large toolbox of nice optimization tricks that unfortunately have not been ported over to lower-level assembly languages.


Solution

  • The partial answer so far is to define a MASM macro, aligned, which appends padding bytes to DWORD and WORD variables to keep them 8-byte aligned in 64-bit code.

    This crude macro accepts a DWORD or WORD variable then determines the number of padding bytes. To prevent duplicate symbol errors, it defines a local num which generates unique labels each time the macro is invoked. The output is the local variable itself followed by the padding: LOCAL pad??0001[4] for a DWORD or LOCAL pad??0001[6] for a WORD.

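    aligned MACRO var
    LOCAL num
    ;; var is the whole declaration, e.g. LOCAL AppleB:DWORD
    IF @InStr(1,<var>,<:DWORD>) NE 0
      padBytes = 4                      ;; DWORD: 4 + 4 = 8 bytes
    ELSEIF @InStr(1,<var>,<:WORD>) NE 0
      padBytes = 6                      ;; WORD: 2 + 6 = 8 bytes
    ENDIF
    ;; any other type leaves padBytes at its previous value
    var
    @CatStr(<LOCAL pad>,<num>,<[padBytes]:BYTE>)
    ENDM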
    

    To mirror the C/C++ alignment syntax, each LOCAL is prefixed with a call to the aligned macro

    main proc   
        aligned LOCAL AppleA:WORD
        aligned LOCAL AppleB:DWORD    
        aligned LOCAL AppleC:WORD
        aligned LOCAL AppleD:DWORD      
        
        LOCAL OrangeA:WORD
        LOCAL OrangeB:DWORD    
        LOCAL OrangeC:WORD
        LOCAL OrangeD:DWORD         
           
        mov AppleA,1
        mov AppleB,2
        mov AppleC,3
        mov AppleD,4   
        
        mov OrangeA,5
        mov OrangeB,6
        mov OrangeC,7
        mov OrangeD,8      
       
        ret
    main endp
    
    end
    

    The memory stack from running the code shows

    00000030`f1b1fb10 00077ff600000008   ; OrangeD 8 is misaligned
    00000030`f1b1fb18 00057ff600000006   ; OrangeB 6 is misaligned
    00000030`f1b1fb20 000000040000001f   ; AppleD 4 is aligned
    00000030`f1b1fb28 0003000000000000   ; AppleC 3 is aligned
    00000030`f1b1fb30 0000000200000000   ; AppleB 2 is aligned
    00000030`f1b1fb38 00017ff6915ae298   ; AppleA 1 is aligned
    

    Notes

    As noted in the comments, this macro does not parse arrays yet. One option is the MASM MOD operator, which returns the remainder of dividing expression1 by expression2. The idea would be to accept an array, say aligned LOCAL myArrayA[10]:BYTE, extract the arraySize of 10, then calculate the 6 required padding bytes using

    padBytes = (8 - (arraySize MOD 8))

    Note that this expression yields 8 rather than 0 when arraySize is already a multiple of 8, so that case would need separate handling.
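
The padding arithmetic is easy to check outside the assembler. Here is a minimal C sketch, assuming a hypothetical helper pad_bytes that is not part of the macro above. The extra final modulo is my addition so that a size that is already a multiple of the alignment needs 0 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes to append after `size` bytes to reach the next multiple of
   `alignment`. Returns 0 when size is already a multiple. */
size_t pad_bytes(size_t size, size_t alignment)
{
    assert(alignment != 0);
    return (alignment - size % alignment) % alignment;
}
```

With an alignment of 8 this gives 6 for the 10-byte array, and 4 and 6 for a DWORD and a WORD, the same values the macro hard-codes.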
    

    Edited

    Michael Petch suggested a clever approach: put all the locals in a struc. A struc can take an alignment value (e.g. 8 or QWORD) and it will align its fields accordingly. It is a bit of extra work, but it does the trick. An example is here: pastebin.com/4cLWm0f1

    .code
    main PROC
        main_locs STRUC QWORD           ; 8 byte (QWORD) alignment
            beginStack DQ ?
            myLocalA   DQ ?
            myLocalB   DQ ?
            myArrayA   BYTE 10 DUP (?)
            myArrayB   BYTE 10 DUP (?)
            endStack   DQ ?
        main_locs ENDS
        LOCAL stack_vars: main_locs
     
        lea rcx, stack_vars
        mov [rcx][main_locs.beginStack], 0ffffffffdeadbeefh
        mov [rcx][main_locs.myLocalA],   0ffffffffffffffffh
        mov [rcx][main_locs.myLocalB],   00000000000000000h
        ;Fill myArrayA with 10 As
        mov rax, "AAAAAAAA"
        mov QWORD PTR[rcx][main_locs.myArrayA], rax
        mov WORD  PTR[rcx][main_locs.myArrayA+8], ax
        ;Fill myArrayB with 10 Bs
        mov rax, "BBBBBBBB"
        mov QWORD PTR[rcx][main_locs.myArrayB], rax
        mov WORD  PTR[rcx][main_locs.myArrayB+8], ax
        mov [rcx][main_locs.endStack], 0ffffffffbaadf00dh
        ret
    main ENDP
     
    end
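
For comparison, the same idea in C looks roughly like this. It is only a sketch, assuming a C11 compiler: the struct reuses the field names from the MASM example, the _Alignas(8) specifiers stand in for the QWORD alignment argument, and end_stack_offset is a hypothetical helper added for checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C analogue of main_locs: the compiler pads after the byte arrays so
   that endStack starts on an 8-byte boundary. */
typedef struct {
    _Alignas(8) uint64_t beginStack;
    _Alignas(8) uint64_t myLocalA;
    _Alignas(8) uint64_t myLocalB;
    char myArrayA[10];
    char myArrayB[10];
    _Alignas(8) uint64_t endStack;
} main_locs;

/* Compile-time check that the padding was inserted. */
static_assert(offsetof(main_locs, endStack) % 8 == 0,
              "endStack should be 8-byte aligned within the struct");

/* Hypothetical helper: byte offset of endStack inside the struct. */
size_t end_stack_offset(void)
{
    return offsetof(main_locs, endStack);
}
```

The three QWORDs and two arrays take 44 bytes, so 4 padding bytes are inserted and endStack lands at offset 48, the same 4 bytes that pad[4] added by hand.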