struct alignment llvm llvm-c++-api

LLVM alignment of nested structs/arrays


I want to get the exact byte representation of nested struct/array datatypes. For example, take the following C struct:

typedef struct zTy {
    int x;
    char c[2];
    struct  { char d; } v;
} z;

It gets converted to the following LLVM IR:

%struct.zTy = type { i32, [2 x i8], %struct.anon }
%struct.anon = type { i8 }

%a = alloca %struct.zTy, align 4

The alloca instruction shows the alignment (4 bytes), but I don't know where this alignment comes from or how the alignment of nested structs is calculated. I get the total size of the struct for my target triple using getTypeAllocSize():

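// Given AllocaInst *AI (the alloca for %a) and its enclosing Module &M:
Type *T = AI->getAllocatedType();
// Module::getDataLayout() returns a reference in LLVM 3.7 and later
// (older releases returned a pointer, hence '->' in older code).
const DataLayout &DL = M.getDataLayout();
uint64_t size = DL.getTypeAllocSize(T); // 8 bytes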

Is there a way to determine the exact layout of arbitrarily nested datatypes for my target architecture from an LLVM pass?


Solution

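  • This is ABI-specific, so it depends on the target. For C/C++, Clang generally computes a struct's alignment as the maximum of the alignments of its members. Here the int is the most strictly aligned field, with a default alignment of 4, which is the alignment you see on the alloca. The size is then rounded up to a multiple of that alignment: the members occupy 4 + 2 + 1 = 7 bytes, padded to 8.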

    Clang has a cc1 option, -fdump-record-layouts, that helps you figure out the layout of a struct or class. For example:

    $ echo "struct zTy {
        int x;
        char c[2];
        struct  { char d; } v;
    } z;" | clang -x c  -w - -Xclang -fdump-record-layouts  -c
    
    *** Dumping AST Record Layout
             0 | struct zTy::(anonymous at <stdin>:4:5)
             0 |   char d
               | [sizeof=1, align=1]
    
    *** Dumping AST Record Layout
             0 | struct zTy
             0 |   int x
             4 |   char [2] c
             6 |   struct zTy::(anonymous at <stdin>:4:5) v
             6 |     char d
               | [sizeof=8, align=4]
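
The offsets in the dump can be cross-checked against the host compiler with offsetof. The following is a minimal sketch, not part of the original answer; the helper name checkZTyLayout is hypothetical, and the asserted numbers assume an ABI where int is 4 bytes with 4-byte alignment (as on typical x86-64 targets). On another target the values may legitimately differ.

```cpp
#include <cassert>
#include <cstddef>  // offsetof

// Same struct as in the question.
struct zTy {
    int x;
    char c[2];
    struct { char d; } v;
};

// Hypothetical helper: asserts that the host compiler's layout matches the
// Clang dump above. Assumes a 4-byte, 4-byte-aligned int.
inline void checkZTyLayout() {
    assert(offsetof(zTy, x) == 0);  // int x starts the struct
    assert(offsetof(zTy, c) == 4);  // char[2] needs no padding after the int
    assert(offsetof(zTy, v) == 6);  // nested struct has align 1, so no gap
    assert(sizeof(zTy) == 8);       // 7 bytes of members, padded to 8
    assert(alignof(zTy) == 4);      // max of the member alignments
}
```

Calling checkZTyLayout() from a test or from main is enough; it only tells you about the host ABI, not about the target your LLVM pass is compiling for.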
    

    Inside LLVM you lose the C types, but you can still inspect the layout of a struct type through the DataLayout, using:

    const StructLayout *getStructLayout(StructType *Ty) const;
    

    Then, using the returned StructLayout, you can get the offset of each element in bits (StructLayout::getElementOffset gives the same offset in bytes) with:

    uint64_t StructLayout::getElementOffsetInBits(unsigned Idx) const
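
To make the rule behind these offsets concrete, here is a small self-contained sketch of the usual layout computation for a non-packed struct: place each field at the next multiple of its ABI alignment, take the largest field alignment as the struct alignment, and round the total size up to it. This is an illustration under those assumptions, not LLVM's actual code; Field, Layout, alignTo and computeLayout are hypothetical names, and the sizes and alignments must come from the target's data layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one field: size and ABI alignment in bytes.
struct Field {
    std::uint64_t size;
    std::uint64_t align;  // assumed to be a power of two
};

// Result of laying out a struct: per-field byte offsets, total size, alignment.
struct Layout {
    std::vector<std::uint64_t> offsets;
    std::uint64_t size = 0;
    std::uint64_t align = 1;
};

// Round value up to the next multiple of align (align is a power of two).
inline std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
    return (value + align - 1) & ~(align - 1);
}

// Lay out the fields in declaration order, as for a non-packed struct.
inline Layout computeLayout(const std::vector<Field> &fields) {
    Layout out;
    std::uint64_t offset = 0;
    for (const Field &f : fields) {
        offset = alignTo(offset, f.align);  // insert padding before the field
        out.offsets.push_back(offset);
        offset += f.size;
        if (f.align > out.align)
            out.align = f.align;            // struct alignment = max field alignment
    }
    out.size = alignTo(offset, out.align);  // tail padding
    return out;
}
```

For the struct in the question, computeLayout({{4, 4}, {2, 1}, {1, 1}}) yields offsets 0, 4 and 6, size 8 and alignment 4, matching the Clang dump. A nested struct or array is handled by computing its own size and alignment first and passing it in as a single Field.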