cross-compiling llvm-clang arm64 raspberry-pi4

How do I cross-compile LLVM/Clang for AArch64 on an x64 host?


I want to use clang-11 on my AArch64 Raspberry Pi 4, running Ubuntu 20.04 Focal. I looked at https://apt.llvm.org/, but prebuilt AArch64 binaries do not seem to be available there.

I tried building clang on the Raspberry Pi directly, but it was very slow and I ran out of space on the SD card eventually.

How do I cross-compile clang myself, on my x64 laptop?


Solution

  • Clang now has arm64 (AArch64) prebuilt binaries for in-development versions at https://apt.llvm.org/, so use those if you can. Otherwise, read on!

    Building LLVM is tricky because it takes a lot of computing resources, which makes it hard to iterate over different build options. My first attempt to build the trunk version of clang for my AArch64 Raspberry Pi produced a build for ARMv7 that was also 30 GB in size, which would not fit on the SD card. Whoops.

    Study the project documentation

    The first relevant LLVM documentation page is Building LLVM with CMake. It explains the CMake options CMAKE_BUILD_TYPE, CMAKE_INSTALL_PREFIX, and LLVM_TARGETS_TO_BUILD.

    It is a good idea to set -DCMAKE_BUILD_TYPE=MinSizeRel, or any value other than Debug, because a Debug build of clang runs significantly slower. Customizing CMAKE_INSTALL_PREFIX is necessary, as you do not want to install the cross-compiled Clang onto your host system. Give it -DCMAKE_INSTALL_PREFIX=$PWD/install, then copy the install directory to your AArch64 machine.
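
    Copying the install directory to the Pi can be sketched as follows. This is a minimal sketch, not part of the original answer: the function name pack_install, the archive name, and the host ubuntu@raspberrypi.local are all hypothetical.

```shell
#!/bin/sh
# pack_install DIR ARCHIVE: pack DIR into a gzip-compressed tarball whose
# top-level entry is DIR's base name (e.g. "install/").
# tar stores symbolic links as links rather than copying their targets.
pack_install() {
    tar -C "$(dirname "$1")" -czf "$2" "$(basename "$1")"
}

# Example use (paths and host name are hypothetical):
#   pack_install "$PWD/install" clang-aarch64.tar.gz
#   scp clang-aarch64.tar.gz ubuntu@raspberrypi.local:
#   ssh ubuntu@raspberrypi.local 'tar -xzf clang-aarch64.tar.gz'
```

    Going through a tarball rather than a recursive scp keeps symbolic links in the install tree as links, since scp -r follows the symbolic links it encounters and copies their targets.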

    To decrease the installed size, set -DLLVM_TARGETS_TO_BUILD=AArch64. The default there is to build all targets.

    Enable assertions (but probably disable debug info)

    If you want to use cutting-edge features (which is likely; why else compile clang yourself?), keep assertions in the clang code enabled, and decide whether you also want debug symbols. Assertions cost almost nothing. Debug info is different: it slows down the linking of the clang binary itself, can make it run slightly slower, and makes it significantly bigger. It can still be worthwhile for debuggability, since it lets you submit a bug report with a symbolized backtrace if anything goes wrong. To get assertions without debug info, check out Getting the Source Code and Building LLVM and set -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On. Also consider -DCMAKE_BUILD_TYPE=MinSizeRel. (If you decide to go with debug info anyway, use -DCMAKE_BUILD_TYPE=RelWithDebInfo.)

    Next up, read Building a Distribution of LLVM. The relevant advice is to further minimize installed size by setting -DLLVM_BUILD_LLVM_DYLIB=On -DLLVM_LINK_LLVM_DYLIB=On -DLLVM_INSTALL_TOOLCHAIN_ONLY=On.

    Finally, read How To Cross-Compile Clang/LLVM using Clang/LLVM. This page is helpful even if you plan on using GCC to cross-compile. If you are using Ubuntu Focal, either directly or in a Docker container, you will probably end up with a skeleton CMake command like this:

    CC=aarch64-linux-gnu-gcc-10 CXX=aarch64-linux-gnu-g++-10 cmake ../llvm \
      -DCMAKE_CROSSCOMPILING=True \
      -DLLVM_TARGET_ARCH=AArch64 \
      -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu \
      -DCMAKE_CXX_FLAGS='-march=armv8-a -mtune=cortex-a72' \
      -GNinja
    

    The options there should be straightforward. Two more options, LLVM_TABLEGEN and CLANG_TABLEGEN, need some explanation. They have to be specified because these binaries need to run on the host, but the cross build compiles them for the target, so it cannot use what it has just built. You must provide existing host binaries yourself. Although llvm-tblgen can be installed with the llvm packages, clang-tblgen is not part of the distribution. That means you need to do two builds: first build these two binaries for the host (you do not have to build the complete LLVM, these two binaries are enough), and then point the cross compilation to them.

    mkdir build-host
    cd build-host
    CC=gcc-10 CXX=g++-10 cmake ../llvm -DLLVM_ENABLE_PROJECTS='clang;compiler-rt;lld;clang-tools-extra' -GNinja
    ninja llvm-tblgen clang-tblgen
    

    Now use these binaries in your cross build by adding the following to your CMake command:

    -DLLVM_TABLEGEN=/mnt/repos/llvm-project/build-host/bin/llvm-tblgen -DCLANG_TABLEGEN=/mnt/repos/llvm-project/build-host/bin/clang-tblgen
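
    Before starting the long cross build, it may be worth confirming that both tablegen binaries really run on the host. The following is a minimal sketch; the function name check_host_tool is hypothetical, and it assumes the tools accept --version, as LLVM command-line tools generally do.

```shell
#!/bin/sh
# check_host_tool PATH: succeed only if PATH exists, is executable,
# and can actually be run on this machine.
check_host_tool() {
    tool=$1
    if [ ! -x "$tool" ]; then
        echo "missing or not executable: $tool" >&2
        return 1
    fi
    # A binary built for a foreign architecture normally fails to start here
    # (unless an emulator such as qemu-user is registered on the host).
    if ! "$tool" --version >/dev/null 2>&1; then
        echo "cannot run on this host: $tool" >&2
        return 1
    fi
    echo "ok: $tool"
}

# Example use, from the llvm-project checkout:
#   check_host_tool build-host/bin/llvm-tblgen &&
#   check_host_tool build-host/bin/clang-tblgen
```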
    

    Start Docker

    It is advisable to mount the directory with the LLVM sources into the container from your own filesystem. This makes it easier to get the compilation results out, and the native filesystem is also faster than Docker's overlay filesystem.

    docker run -v `pwd`:/mnt --rm -it ubuntu:focal bash
    

    Install dependencies

    On Ubuntu 20.04 Focal

    apt update
    apt install g++-10-aarch64-linux-gnu libstdc++-10-dev-arm64-cross gcc-10 g++-10
    apt install cmake ninja-build python3
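
    A quick way to confirm that the packages above put the expected commands on PATH is sketched below. The function name missing_tools is hypothetical; the command names are the ones used in the build commands of this answer.

```shell
#!/bin/sh
# missing_tools NAME...: print every command name that is not found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || echo "$t"
    done
}

# Report anything the cross build would fail to find.
missing_tools aarch64-linux-gnu-gcc-10 aarch64-linux-gnu-g++-10 \
    gcc-10 g++-10 cmake ninja python3
```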
    

    Configure

    mkdir build-aarch64
    cd build-aarch64
    
    CC=aarch64-linux-gnu-gcc-10 CXX=aarch64-linux-gnu-g++-10 cmake ../llvm \
        -DCMAKE_BUILD_TYPE=RelWithDebInfo \
        -DLLVM_ENABLE_ASSERTIONS=On \
        -DCMAKE_CROSSCOMPILING=True \
        -DCMAKE_INSTALL_PREFIX="$PWD/install" \
        -DLLVM_DEFAULT_TARGET_TRIPLE=aarch64-linux-gnu \
        -DLLVM_TARGET_ARCH=AArch64 \
        -DLLVM_TARGETS_TO_BUILD=AArch64 \
        -DCMAKE_CXX_FLAGS='-march=armv8-a -mtune=cortex-a72' \
        -GNinja \
        -DLLVM_ENABLE_PROJECTS='clang;compiler-rt;lld;clang-tools-extra' \
        -DLLVM_TABLEGEN=/mnt/repos/llvm-project/build-host/bin/llvm-tblgen \
        -DCLANG_TABLEGEN=/mnt/repos/llvm-project/build-host/bin/clang-tblgen \
        -DLLVM_BUILD_LLVM_DYLIB=On \
        -DLLVM_LINK_LLVM_DYLIB=On \
        -DLLVM_INSTALL_TOOLCHAIN_ONLY=On
    

    Compile

    Get a powerful build machine if you can. Linking some of the binaries takes a lot of RAM. You should have about 20 GB of memory available to get anywhere in reasonable time; 64 GB would be even better. If multiple link jobs running in parallel exhaust your machine's memory, limit the number of parallel tasks, for example to 3 with ninja -j3.

    ninja install -j3
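
    The choice of -j can be derived from the memory that is actually available instead of being guessed. The following is a rough sketch under a loud assumption: the budget of 6 GB per parallel job is illustrative only, picked so that 20 GB yields -j3 as in the command above. The function name jobs_for_mem_kb is hypothetical.

```shell
#!/bin/sh
# Sketch: pick a ninja -j value from the memory that is currently available.
# ASSUMPTION: GB_PER_JOB=6 is an illustrative budget per parallel job,
# chosen only so that 20 GB yields -j3; it is not a measured figure.
GB_PER_JOB=6

# jobs_for_mem_kb AVAILABLE_KB GB_PER_JOB: print a job count, never below 1.
jobs_for_mem_kb() {
    _jobs=$(( $1 / ($2 * 1024 * 1024) ))
    if [ "$_jobs" -lt 1 ]; then
        _jobs=1
    fi
    echo "$_jobs"
}

# Linux reports MemAvailable in kB in /proc/meminfo.
if [ -r /proc/meminfo ]; then
    avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
    echo "suggested: ninja install -j$(jobs_for_mem_kb "${avail_kb:-0}" "$GB_PER_JOB")"
fi
```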
    

    Using a different linker may also decrease memory requirements: ld.gold reportedly needs less memory while linking than the default linker. LLVM's CMake option -DLLVM_USE_LINKER=gold selects it.
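
    Once the build has finished, it is worth checking that the result really targets AArch64 and not, say, 32-bit ARM. The sketch below reads the e_machine field of the ELF header with od, so it does not depend on the file utility being installed. The function names are hypothetical; the machine codes (183 for AArch64, 40 for 32-bit ARM, 62 for x86-64) come from the ELF specification.

```shell
#!/bin/sh
# elf_machine FILE: print the ELF header's e_machine field as a decimal number.
# e_machine is the 2-byte field at offset 18; little-endian byte order is
# assumed, which holds for x86-64 and for the usual Linux ARM targets.
# The ELF magic is not validated, so only use this on files known to be ELF.
elf_machine() {
    set -- $(od -An -tu1 -j18 -N2 "$1")
    echo $(( ${1:-0} + 256 * ${2:-0} ))
}

# elf_arch_name FILE: translate a few common e_machine values to names.
elf_arch_name() {
    case $(elf_machine "$1") in
        183) echo aarch64 ;;
        40)  echo arm ;;
        62)  echo x86_64 ;;
        *)   echo unknown ;;
    esac
}

# Example use, from the build-aarch64 directory after "ninja install":
#   elf_arch_name install/bin/clang
```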