I'd like to perform source code analysis of the Linux kernel, but to do that I first need to parse it. What are my options? I'd prefer an AST usable from Python, but any other language is OK too.
Apparently CIL is able to parse the whole kernel, but it's not clear from the website how to do that.
I'd recommend starting with the sparse static analysis tool. Because sparse was designed specifically to help the kernel developers perform static analysis on the kernel, you can have some level of assurance that it should parse the combination of C99 and GNU extensions used in the kernel sources. The code I've examined looked clean and straightforward, but I have never tried to extend it in any fashion. The Documentation/sparse.txt file has a very short synopsis of using sparse on the kernel sources, if you want a very high-level overview.
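Since you mentioned Python, here is a minimal sketch of driving sparse through the kernel build and collecting its diagnostics. It relies on the kernel build's C=1 / C=2 and CHECK= make variables (C=1 checks only files being recompiled, C=2 checks every source file). The helper names, the kernel tree path, and the assumed "file:line:col: kind: message" diagnostic layout are my own choices for illustration, not part of sparse itself, and this gives you a list of warnings rather than an AST.

```python
import re
import subprocess

# Assumed diagnostic layout: path:line:col: warning|error: message
DIAG_RE = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<kind>warning|error): (?P<msg>.*)$"
)


def sparse_make_command(check_all=True, jobs=1, checker="sparse"):
    """Build the make invocation that runs a checker over the kernel.

    C=1 checks only files that get recompiled; C=2 checks all files.
    """
    level = 2 if check_all else 1
    return ["make", f"C={level}", f"CHECK={checker}", f"-j{jobs}"]


def parse_diagnostics(output):
    """Turn checker output into a list of dicts, skipping other lines."""
    found = []
    for text in output.splitlines():
        match = DIAG_RE.match(text)
        if match:
            entry = match.groupdict()
            entry["line"] = int(entry["line"])
            entry["col"] = int(entry["col"])
            found.append(entry)
    return found


def run_sparse(kernel_tree, **kwargs):
    """Run the build in kernel_tree (hypothetical path) and parse its output."""
    result = subprocess.run(
        sparse_make_command(**kwargs),
        cwd=kernel_tree,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams into one text blob
        text=True,
    )
    return parse_diagnostics(result.stdout)
```

You would call something like run_sparse("/path/to/linux", jobs=4) on an already configured tree. For anything beyond collecting warnings you would still need to work inside sparse's own C code.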
Another option is GCC MELT, a tool designed to make it easier to build plugins for the gcc compiler. Using it would require knowing enough gcc internals to find your way around, but MELT does look far easier than coding a similar plugin directly in C.