I'm writing a programming language and am stuck at the code generation phase.
After thorough consideration, I decided to use LLVM as my back end because I don't want to deal with obscure low-level details (generating assembly is fine with me, but I would need to learn more about linking to finish the job).
One stumbling block is that my compiler is not written in C++, which means I can't use the ready-made LLVM classes in my code.
Could I generate LLVM IR as a character string, save it to a file (or is that unnecessary?), and then compile it? If so, is there another form I could generate that LLVM would process faster?
Thanks in advance for any advice.
Quoting the LLVM documentation:
Your compiler front-end will communicate with LLVM by creating a module in the LLVM intermediate representation (IR) format. Assuming you want to write your language’s compiler in [something other than C++], there are 3 major ways to tackle generating LLVM IR from a front-end:
1. Call into the LLVM libraries code using your language’s FFI (foreign function interface).
   - for: best tracks changes to the LLVM IR, .ll syntax, and .bc format
   - for: enables running LLVM optimization passes without an emit/parse overhead
   - for: adapts well to a JIT context
   - against: lots of ugly glue code to write
2. Emit LLVM assembly from your compiler’s native language.
   - for: very straightforward to get started
   - against: the .ll parser is slower than the bitcode reader when interfacing to the middle end
   - against: it may be harder to track changes to the IR
3. Emit LLVM bitcode from your compiler’s native language.
   - for: can use the more-efficient bitcode reader when interfacing to the middle end
   - against: you’ll have to re-engineer the LLVM IR object model and bitcode writer in your language
   - against: it may be harder to track changes to the IR
The option you describe is number 2: emitting LLVM assembly (the textual .ll form) as a string and handing it to the LLVM tools.
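To make option 2 concrete, here is a minimal sketch in Python (chosen only for illustration; your compiler's own language works the same way). It assumes `clang` is installed and on your PATH. The names `emit_add_module`, `compile_ir`, and `demo.ll` are made up for this example and are not part of LLVM.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def emit_add_module() -> str:
    """Build textual LLVM IR (.ll) for a tiny module as a plain string."""
    lines = [
        "define i32 @add(i32 %a, i32 %b) {",
        "entry:",
        "  %sum = add i32 %a, %b",
        "  ret i32 %sum",
        "}",
        "",
        "define i32 @main() {",
        "entry:",
        "  %r = call i32 @add(i32 40, i32 2)",
        "  ret i32 %r",
        "}",
    ]
    return "\n".join(lines) + "\n"


def compile_ir(ir: str, out_dir: Path) -> Optional[Path]:
    """Write the IR to demo.ll and let clang compile and link it.

    Returns the executable path, or None when clang is not installed
    (assumption: clang is the driver you want to use).
    """
    ll_path = out_dir / "demo.ll"
    ll_path.write_text(ir)
    clang = shutil.which("clang")
    if clang is None:
        return None
    exe_path = out_dir / "demo"
    # clang accepts .ll input directly and takes care of linking.
    subprocess.run([clang, str(ll_path), "-o", str(exe_path)], check=True)
    return exe_path


if __name__ == "__main__":
    # Only print the generated IR here; call compile_ir() to build it.
    print(emit_add_module(), end="")
```

Calling `compile_ir(emit_add_module(), Path("."))` should leave a `demo` executable next to `demo.ll`. As for the file being optional: `llc` reads from standard input when no input file is given, so you can pipe the string in instead of saving it. If parse time ever becomes a concern, `llvm-as` converts a `.ll` file into bitcode (`.bc`), although for a first compiler the text form is usually the easier place to start.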