I have the opportunity to work at the university and help hack on javac from the OpenJDK. The goal is to read custom source code (for "our" programming language, parsed with ANTLR) and, instead of emitting Java bytecode, have the compiler emit LLVM assembly code. This would be my task, but the project is so huge that I don't know where or how to start understanding what is going on. I was told to try debugging the code and stepping through it, but I would like to know whether there is any good documentation out in the wild that would give me a quick start on which parts are the most important ones.
You want to turn "myprogram.myprogrlang" into "myprogram.llvm".
I don't see the need to use or hack javac. I think you want to reuse the compiler tools of Java / OpenJDK, but I think that only makes your task more difficult instead of helping you.
My suggestion is to take ANTLR, learn how it parses a program in your programming language and how to generate an AST from it, and then turn that AST into LLVM bitcode or assembly.
You don't need Java's compiler in this case.
So:
[1] Learn ANTLR grammars / rules for your programming language
[2] Learn LLVM bitcode / assembly
[3] Learn how to turn ANTLR output (the AST) into LLVM input
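
To give an idea of what step [3] can look like, here is a minimal sketch in Java. It is not how javac or any real LLVM frontend does it. The `Expr`, `Num` and `BinOp` classes are a hypothetical AST that I made up for illustration; in your project they would be built from the ANTLR parse tree (for example in a visitor). The sketch only handles integer `+`, `-` and `*`, and writes textual LLVM assembly for a single `main` function.

```java
import java.util.ArrayList;
import java.util.List;

public class LlvmEmitterSketch {
    // Hypothetical AST node types (assumption: not part of ANTLR or LLVM).
    interface Expr {}

    static final class Num implements Expr {
        final int value;
        Num(int value) { this.value = value; }
    }

    static final class BinOp implements Expr {
        final char op;
        final Expr left;
        final Expr right;
        BinOp(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    private final List<String> instructions = new ArrayList<>();
    private int nextTemp = 0;

    // Emits instructions for expr and returns the operand that holds its
    // value: either an integer literal or a named register such as %t0.
    String emit(Expr expr) {
        if (expr instanceof Num) {
            return Integer.toString(((Num) expr).value);
        }
        BinOp bin = (BinOp) expr;
        String lhs = emit(bin.left);
        String rhs = emit(bin.right);
        String opcode;
        switch (bin.op) {
            case '+': opcode = "add"; break;
            case '-': opcode = "sub"; break;
            case '*': opcode = "mul"; break;
            default:
                throw new IllegalArgumentException("unsupported operator: " + bin.op);
        }
        String temp = "%t" + nextTemp++;
        instructions.add("  " + temp + " = " + opcode + " i32 " + lhs + ", " + rhs);
        return temp;
    }

    // Wraps the emitted instructions in a main function that returns the result.
    static String compile(Expr expr) {
        LlvmEmitterSketch emitter = new LlvmEmitterSketch();
        String result = emitter.emit(expr);
        StringBuilder out = new StringBuilder("define i32 @main() {\nentry:\n");
        for (String line : emitter.instructions) {
            out.append(line).append('\n');
        }
        out.append("  ret i32 ").append(result).append("\n}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // AST for: 1 + 2 * 3
        Expr ast = new BinOp('+', new Num(1), new BinOp('*', new Num(2), new Num(3)));
        System.out.print(compile(ast));
    }
}
```

For `1 + 2 * 3` this prints a `mul` followed by an `add` inside `define i32 @main()`. If you save the output as "myprogram.llvm" (LLVM's own tools usually use the `.ll` extension), you should be able to feed it to the LLVM tools such as `lli` or `llc`. A real language needs much more (variables, types, control flow, functions), but the overall shape stays the same: walk the tree and emit instructions for each node.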