Crystal-lang: why is the LLVM "hello.bc" not the same if generated by Crystal or by clang?


This is my first Stack Overflow question :-)

My background:

  • 2 years Python experience
  • 2 months crystal-lang experience (websites running with the Amber framework)
  • 1 month into C, C++, assembly

Facts:

  • crystal-lang is compiling and running without any problem
  • running on x86_64

Please be nice, as I don't have much low-level language knowledge yet.

From my understanding, when we compile and run a basic hello.c file using LLVM, it goes as follows:

hello.c :

#include <stdio.h>
int main() {
  printf("hello world\n");
  return 0;
}

shell :

$ clang -O3 -emit-llvm hello.c -c -o hello.bc
$ llc hello.bc -o hello.s
$ gcc hello.s -o hello.native
$ ./hello.native

( this comes from the LLVM examples )

My point is that we can produce a pretty short hello.bc file (128 lines) that can be run in a slower way using:

$ lli hello.bc

But when I tried to generate a similar hello.bc from a hello.cr file and run it like I did with the hello.c file:

hello.cr :

puts "hello world"

shell :

$ crystal build hello.cr --emit llvm-bc --release
$ llc hello.bc -o hello.s

What I noticed:

  • This hello.bc file is much, much bigger than the one generated from the C file (43'624 lines)
  • This hello.bc can't be run using "lli", as it fails with:

    "LLVM ERROR: Program used external function 'pcre_malloc' which could not be resolved!"

  • I can't even compile hello.s to hello.native

  • Same problem if I generate and use a hello.ll file instead

As I understand it, LLVM is portable, and all front-end languages produce an intermediate *.bc that can then be compiled for any architecture.

My questions are:

  • Why are the two hello.bc files not similar?
  • Am I doing something wrong in the Crystal procedure?

Thank you!


Solution

  • Everything is just as it is supposed to be. Crystal has a runtime library that is always present even if you didn't include anything. This is required to run the Crystal program.

    The C example contains pretty much nothing other than a call to printf from the C standard library. That's why the compiled assembly is also really tiny.

    Crystal's simple puts call has much more behind it. It is based on libraries for handling asynchronous IO, concurrency, signal handling, garbage collection and more. Some of these libraries are implemented entirely in the Crystal standard library; others rely on external libraries that are either embedded directly into the binary (libgc) or still required as dynamic libraries from the system (libpcre, libpthread).

    Every Crystal program comes with this runtime library by default, even an empty one. This usually goes unnoticed because larger programs will eventually need those things anyway, and the compiled binary size of the runtime library is less than 500 KB (in release mode). A small program like yours doesn't really need all of this just to print a string, but these libraries are required for the Crystal runtime.
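One way to see some of these external dependencies for yourself is to list the shared libraries the compiled binary links against. The following is only a minimal sketch, assuming a Linux system where ldd is available (on macOS, otool -L plays a similar role); the exact list depends on the Crystal version and platform. The libraries shown there are among those that also have to be made available to lli or passed to the linker when working from hello.bc or hello.s by hand, which is why 'pcre_malloc' could not be resolved.

```shell
# Recreate the one-line program from the question.
echo 'puts "hello world"' > hello.cr

# Build and inspect only if the tools are installed (assumption: Linux with ldd).
if command -v crystal >/dev/null 2>&1 && command -v ldd >/dev/null 2>&1; then
  # ldd prints the dynamic libraries the finished binary depends on.
  crystal build hello.cr --release -o hello && ldd ./hello
fi
```

Statically embedded libraries such as libgc will not necessarily show up in this list, so it should be read as a lower bound on what the runtime pulls in.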

    NOTE: You can compile a Crystal program without these default libraries. But this means you can't use anything from the Crystal stdlib and you have to essentially write C code with Crystal syntax (or implement your own stdlib):

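    # Only the raw C bindings are loaded; nothing from the Crystal stdlib.
    require "lib_c"
    require "c/stdio"
    
    # @c is the String's first byte; a pointer to it serves as a C string.
    LibC.printf pointerof("hello world".@c)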
    

    This can be compiled with the --prelude=empty option, and it will generate substantially smaller assembly, roughly similar to the C example.
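To compare the two outputs, the prelude-less program can be built with the same --emit mechanism used in the question. This is a hedged sketch rather than a verified recipe: hello_min.cr is a hypothetical file name, --emit llvm-ir is used so that the result is a readable hello_min.ll, and whether the build links cleanly may depend on the Crystal version in use.

```shell
# Write the prelude-less program from the answer to a file
# (hello_min.cr is a hypothetical name chosen for this sketch).
cat > hello_min.cr <<'EOF'
require "lib_c"
require "c/stdio"

LibC.printf pointerof("hello world".@c)
EOF

# Only attempt the build if a Crystal compiler is installed.
if command -v crystal >/dev/null 2>&1; then
  # Emit textual LLVM IR and count its lines to compare with the C version.
  crystal build hello_min.cr --prelude=empty --release --emit llvm-ir && wc -l hello_min.ll
fi
```

The line count printed by wc can then be set against the one obtained from the regular hello.cr build to see how much of the IR comes from the runtime library.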