I've recently started using libclang to parse C files. The problem I'm having is that libclang apparently runs the preprocessor before generating the AST. I would like to prevent the preprocessor from running, and instead be told which preprocessor directives are in the file.
I use the following Python script (cindex.py and libclang):
from __future__ import print_function
import codecs
from clang.cindex import Index


class SourceFile(object):
    def __init__(self, path):
        # Keep the raw text so declaration extents can be sliced out of it.
        with codecs.open(path, 'r', 'utf-8') as file:
            self.file_content = file.read()

        index = Index.create()
        root_node = index.parse(path)

        # List every file pulled in through #include.
        for included in root_node.get_includes():
            print(included.include)

        self.print_declarations(root_node.cursor)

    def print_declarations(self, root, recurse=True):
        print(root.kind.name, root.spelling)

        if root.kind.is_declaration():
            node_def = root.get_definition()
            if node_def is not None:
                start_offset = node_def.extent.start.offset
                end_offset = node_def.extent.end.offset + 1
                print(self.file_content[start_offset:end_offset], '\n')

        # Only descend one level below the translation unit.
        if recurse:
            for child in root.get_children():
                self.print_declarations(child, False)


if __name__ == '__main__':
    path = 'Sample.cpp'
    print('Translation unit:', path)
    source = SourceFile(path)
Which outputs:
Translation unit: Sample.cpp
/mingw/include\stdio.h
/mingw/include\_mingw.h
/mingw/include\sys/types.h
TRANSLATION_UNIT None
TYPEDEF_DECL __builtin_va_list
STRUCT_DECL _iobuf
TYPEDEF_DECL FILE
VAR_DECL _iob
UNEXPOSED_DECL
FUNCTION_DECL main
int main()
{
    printf(HELLO_WORLD);
    return 0;
}
For the following C-code:
#include <stdio.h>
#define HELLO_WORLD "HELLO!"
int main()
{
    printf(HELLO_WORLD);
    return 0;
}
What I would like is to get DEFINE_DECL HELLO_WORLD for the #define in my code (currently I get nothing), and of course similar entries for my #include directives. Is this possible?
EDIT: Basically, I want to parse the file without preprocessor directives expanded.
A few days ago I asked the same question on the #llvm Freenode IRC channel. The answer was "macros are not part of the AST, so you can't", but the "-fsyntax-only" option and a clang plugin instead of libclang may well help you.
Edited: Looks like now it is actually possible, see answer by bradtgmurray
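For reference, here is a minimal sketch of one way this can be done with the Python bindings. It is not necessarily what the other answer does. As far as I know, passing TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD to index.parse() makes libclang expose cursors of kind MACRO_DEFINITION, MACRO_INSTANTIATION and INCLUSION_DIRECTIVE as children of the translation unit. The preprocessor still runs; it is only recorded, not disabled. The names preprocessor_entities, parse_with_preprocessing_record and Directive are made up for this example.

```python
from collections import namedtuple

# Names of the cursor kinds libclang uses for preprocessor entities.
PREPROCESSOR_KINDS = {
    "MACRO_DEFINITION",
    "MACRO_INSTANTIATION",
    "INCLUSION_DIRECTIVE",
}

# Hypothetical record type for this sketch: the kind name and the spelling.
Directive = namedtuple("Directive", "kind spelling")


def preprocessor_entities(cursors):
    """Keep only the cursors that describe preprocessor entities."""
    return [
        Directive(cursor.kind.name, cursor.spelling)
        for cursor in cursors
        if cursor.kind.name in PREPROCESSOR_KINDS
    ]


def parse_with_preprocessing_record(path):
    """Parse `path` and return the preprocessor entities libclang recorded.

    Assumes the clang Python bindings and a matching libclang are installed.
    """
    # Imported lazily so the filter above can be used without libclang.
    from clang.cindex import Index, TranslationUnit

    index = Index.create()
    tu = index.parse(
        path, options=TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD
    )
    return preprocessor_entities(tu.cursor.get_children())
```

Calling parse_with_preprocessing_record('Sample.cpp') should then include an entry for HELLO_WORLD and one for the stdio.h include. The list will likely also contain macros defined by the included headers and by the compiler itself, so you may want to filter on cursor.location.file as well.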