Context
The research group I work with develops a verified LLVM* interpreter. We are currently working on adding support for Rust-generated LLVM.
Compiling a simple hello world program with rustc --emit=llvm-ir produces valid LLVM code; however, the code naturally contains references to functions from the Rust standard library (std), marked as hidden within the generated file.
The need
Because we have an interpreter, we need a single, fully interpretable .ll file, or a set of .ll files that define all referenced functions, which we can link within our system. It seems that because Rust relies heavily on cargo and/or rustc for linking with the standard library, generating executables from LLVM produced with --emit=llvm-ir is not a simple, natively supported feature. I would like to know whether an elegant solution for this exists.
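To make the requirement concrete, here is a rough sketch of how one might check whether a given .ll file still has functions that are declared but not defined. The file sample.ll and the helper undefined_functions are made up for this illustration, and the pattern only handles unquoted symbol names, so treat it as a starting point rather than a complete check.

```shell
#!/bin/bash
# Sketch: list functions that a textual LLVM IR file declares but never defines.
# sample.ll is a hypothetical stand-in for real rustc output.
cat > sample.ll <<'EOF'
declare void @__rust_dealloc(ptr, i64, i64)
declare ptr @__rust_alloc(i64, i64)
declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

define i32 @main() {
  ret i32 0
}
EOF

# undefined_functions FILE.ll (hypothetical helper)
undefined_functions() {
  # Top-level 'declare' lines have no body; 'define' lines do.
  # LLVM intrinsics (@llvm.*) are filtered out because they never get a body in IR.
  # Assumption: names are unquoted; @"quoted names" are skipped by this pattern.
  sed -n 's/^declare [^@]*@\([A-Za-z0-9_.$-]\{1,\}\).*/\1/p' "$1" \
    | grep -v '^llvm\.' \
    | sort -u
}

undefined_functions sample.ll
```

An empty result would mean that every non-intrinsic function in the file has a body; anything printed still has to come from another .ll file or be handled by the interpreter itself.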
Current solution, request for improvement
One post proposes a solution for building with dependencies by compiling the standard library to LLVM bitcode and linking with llvm-link. We could then use llvm-dis to get a .ll file. This, however, can lead to undefined references due to how rustc interacts with the LLVM assembler:
Yay, it worked! Well, except for the calls to undefined functions in there that still managed to slip through. __rust_alloc, __rust_dealloc, __rust_realloc, and __rust_alloc_zeroed are magic functions that are defined if you use Rust's LLVM fork. The standard library also depends on libpthread and dlsym, which are language-agnostic libraries/functions that are usually implemented in C. You can use clang and a libc implementation that supports being compiled with Clang (GNU libc doesn't, I think musl might work here?) to get that if needed. Also if you are compiling to an executable it has trouble finding main from _start.
We have run into several problems of this nature while replicating this post. Hence, we are seeking a better solution, if one exists. Thanks in advance.
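For reference, the llvm-link route described above boils down to two commands. Below is a minimal dry-run sketch of it. The file names main.bc and std.bc are placeholders, and LLVM_BIN, DRY_RUN, run and link_to_ll are helpers invented for this example, not part of the linked post.

```shell
#!/bin/bash
# Sketch of the llvm-link route: merge bitcode modules, then disassemble to text.
# LLVM_BIN and DRY_RUN are hypothetical knobs introduced for this example.
LLVM_BIN="${LLVM_BIN:-}"   # e.g. "/path/to/llvm/bin/"; empty means "use PATH"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, anything else = run them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# link_to_ll OUTPUT.ll INPUT.bc...
link_to_ll() {
  out="$1"; shift
  bc="${out%.ll}.bc"
  # llvm-link merges all input modules into a single bitcode module.
  run "${LLVM_BIN}llvm-link" "$@" -o "$bc"
  # llvm-dis turns that bitcode into textual IR.
  run "${LLVM_BIN}llvm-dis" "$bc" -o "$out"
}

link_to_ll combined.ll main.bc std.bc
```

With the defaults this only prints the two commands. Setting DRY_RUN=0 would actually run them, assuming the LLVM tools are installed and the .bc inputs exist.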
Edit: Posted own answer with build script.
*we interpret a close subset of LLVM.
Found own answer:
By tweaking the LTO settings in the build scripts from the aforementioned post (the key was the -Clto flag), we can now compile a .rs file to a single .ll file and build an executable from it. An extension for programs with dependencies is in the works and will be posted here if I remember to.
Here's a compile script (slightly more legible than the Makefile we currently use):
#!/bin/bash
set -x
OUTPUT_DIR=$(pwd)
LLVM_HOME=/opt/homebrew/Cellar/llvm/19.1.7
RUSTUP_TOOLCHAIN_LIB=/Users/omitted/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/aarch64-apple-darwin/lib
# Build main with LTO and emit LLVM IR.
# The key for in-house linking was -Clto.
rustc -Clto --emit=llvm-ir main.rs
# Run main.ll through opt; -S keeps the output as textual IR (opt writes bitcode otherwise).
$LLVM_HOME/bin/opt -S -o main.ll main.ll
# .ll -> .o
$LLVM_HOME/bin/llc -filetype=obj main.ll
# Complete the linking to an executable.
# -nodefaultlibs drops the default libraries; libSystem, libresolv, and libc are linked explicitly.
# Also strip dead code so we don't keep tons of Rust std library code that isn't referenced.
$LLVM_HOME/bin/clang -m64 -Wl,-dead_strip -nodefaultlibs -lSystem -lresolv -lc main.o -o main