Evaluate all macros in a C++ header file


I have a requirement to build an automated system to parse a C++ .h file with a lot of #define statements in it and do something with the value that each #define works out to. The .h file has a lot of other junk in it besides the #define statements.

The objective is to create a key-value list, where the keys are all the keywords defined by the #define statements and the values are the evaluations of the macros which correspond to the definitions. The #defines define the keywords with a series of nested macros that ultimately resolve to compile-time integer constants. There are some that do not resolve to compile-time integer constants, and these must be skipped.

The .h file will evolve over time, so the tool cannot be a long hardcoded program which instantiates a variable to be equal to each keyword. I have no control over the contents of the .h file. The only guarantees are that it can be built with a standard C++ compiler, and that more #defines will be added but never removed. The macro formulas may change at any time.

The options I see for this are:

  1. Implement a partial (or hook into an existing) C++ compiler and intercept the value of the macros during the preprocessor step.
  2. Use regexes to dynamically build a source file which will consume all the macros currently defined, then compile and execute the source file to get the evaluated form of all the macros. Somehow (?) skip the macros which do not evaluate to compile-time integer constants. (Also, not sure if regex is expressive enough to capture all possible multi-line macro definitions)

Both of these approaches would add substantial complexity and fragility to the build process for this project which I would like to avoid. Is there a better way to evaluate all the #define macros in a C++ .h file?

Below is an example of what I am looking to parse:

#ifndef Constants_h
#define Constants_h

namespace Foo
{
#define MAKE_CONSTANT(A, B) (A | (B << 4))
#define MAGIC_NUMBER_BASE 40
#define MAGIC_NUMBER MAGIC_NUMBER_BASE + 0x2
#define MORE_MAGIC_1 345
#define MORE_MAGIC_2 65


    // Other stuff...


#define CONSTANT_1 MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)
#define CONSTANT_2 MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)
    // etc...

#define SKIP_CONSTANT "What?"

    // More CONSTANT_N mixed with more other stuff and constants which do
    // not resolve to compile-time integers and must be skipped


}

#endif // Constants_h

What I need to get out of this is the names and evaluations of all the defines which resolve to compile-time integer constants. In this case, for the defines shown it would be

MAGIC_NUMBER_BASE 40
MAGIC_NUMBER 42
MORE_MAGIC_1 345
MORE_MAGIC_2 65
CONSTANT_1 1887
CONSTANT_2 -9

It doesn't really matter what format this output is in as long as I can work with it as a list of key-value pairs further down the pipe.


Solution

  • An approach could be to write a "program generator" that generates a program (the printDefines program) comprising statements like std::cout << "MAGIC_NUMBER" << " " << (MAGIC_NUMBER_BASE + 0x2) << std::endl;. Obviously, executing such statements will resolve the respective macros and print out their values.

    The list of macros in a header file can be obtained by running g++ with the -dM -E options. Feeding this "program generator" with such a list of #defines will generate a "printDefines.cpp" with all the required cout statements. Compiling and executing the generated printDefines program then yields the final output. It will resolve all the macros, including those that themselves use other macros.

    See the following shell script and the following program generator code that together implement this approach:

    Script printing the values of #define-statements in "someHeaderfile.h":

    #  printDefines.sh
    g++ -std=c++11 -dM -E someHeaderfile.h > defines.txt
    ./generateDefinesCpp someHeaderfile.h defines.txt > defines.cpp
    g++ -std=c++11 -o defines.o defines.cpp
    ./defines.o
    

    Code of program generator "generateDefinesCpp":

    #include <stdio.h>
    #include <string>
    #include <iostream>
    #include <fstream>
    #include <cstring>
    
    using std::cout;
    using std::endl;
    
    /*
     * Argument 1: name of the header file to scan
     * Argument 2: name of the file listing the #defines (the output of g++ -dM -E)
     * The generated cpp source is written to stdout.
     */
    int main(int argc, char* argv[])
    {
        if (argc < 3) {
            std::cerr << "usage: generateDefinesCpp <header> <defines-list>" << endl;
            return 1;
        }
        cout << "#include<iostream>" << endl;
        cout << "#include<stdio.h>" << endl;
        cout << "#include \"" << argv[1] << "\"" << endl;
        cout << "int main() {" << endl;
        std::ifstream definesFile(argv[2], std::ios::in);
        std::string buffer;
        char macroName[1000];
        int macroValuePos;
        while (getline(definesFile,buffer)) {
            const char *bufferCStr = buffer.c_str();
            // %999s bounds the write into macroName; %n records where the macro body starts
            if (sscanf(bufferCStr, "#define %999s %n", macroName, &macroValuePos) == 1) {
                const char* macroValue = bufferCStr+macroValuePos;
                if (macroName[0] != '_' && strchr(macroName, '(') == NULL  && *macroValue) {
                    cout << "std::cout << \"" << macroName << "\" << \" \" << (" << macroValue << ") << std::endl;" << std::endl;
                }
            }
        }
        cout << "return 0; }" << endl;
    
        return 0;
    }
    

    The approach could be optimised so that the intermediate files defines.txt and defines.cpp are not necessary; for demonstration purposes, however, they are helpful. When applied to your header file, the content of defines.txt and defines.cpp will be as follows:

    defines.txt:

    #define CONSTANT_1 MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)
    #define CONSTANT_2 MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)
    #define Constants_h 
    #define MAGIC_NUMBER MAGIC_NUMBER_BASE + 0x2
    #define MAGIC_NUMBER_BASE 40
    #define MAKE_CONSTANT(A,B) (A | (B << 4))
    #define MORE_MAGIC_1 345
    #define MORE_MAGIC_2 65
    #define OBJC_NEW_PROPERTIES 1
    #define SKIP_CONSTANT "What?"
    #define _LP64 1
    #define __APPLE_CC__ 6000
    #define __APPLE__ 1
    #define __ATOMIC_ACQUIRE 2
    #define __ATOMIC_ACQ_REL 4
    ...
    

    defines.cpp:

    #include<iostream>
    #include<stdio.h>
    #include "someHeaderfile.h"
    int main() {
    std::cout << "CONSTANT_1" << " " << (MAKE_CONSTANT (MAGIC_NUMBER + 564, MORE_MAGIC_1 | MORE_MAGIC_2)) << std::endl;
    std::cout << "CONSTANT_2" << " " << (MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)) << std::endl;
    std::cout << "MAGIC_NUMBER" << " " << (MAGIC_NUMBER_BASE + 0x2) << std::endl;
    std::cout << "MAGIC_NUMBER_BASE" << " " << (40) << std::endl;
    std::cout << "MORE_MAGIC_1" << " " << (345) << std::endl;
    std::cout << "MORE_MAGIC_2" << " " << (65) << std::endl;
    std::cout << "OBJC_NEW_PROPERTIES" << " " << (1) << std::endl;
    std::cout << "SKIP_CONSTANT" << " " << ("What?") << std::endl;
    return 0; }
    

    And the output of executing defines.o is then:

    CONSTANT_1 1887
    CONSTANT_2 -9
    MAGIC_NUMBER 42
    MAGIC_NUMBER_BASE 40
    MORE_MAGIC_1 345
    MORE_MAGIC_2 65
    OBJC_NEW_PROPERTIES 1
    SKIP_CONSTANT What?
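
The output above still lists SKIP_CONSTANT, which the question asked to skip. One possible refinement, sketched here and not part of the original answer, is to have the generator emit calls to a small helper that prints a value only when its type is integral. The names printIfIntegral and collectIntegralDefines are hypothetical, and the sketch assumes that every macro body that reaches this point is at least a valid expression. The static_assert also shows why CONSTANT_2 comes out as -9: macro arguments are substituted as text, so normal operator precedence applies to the expanded expression.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// Macros copied from the question's header so that the sketch is self-contained.
#define MAKE_CONSTANT(A, B) (A | (B << 4))
#define MAGIC_NUMBER_BASE 40
#define MAGIC_NUMBER MAGIC_NUMBER_BASE + 0x2
#define MORE_MAGIC_1 345
#define MORE_MAGIC_2 65
#define CONSTANT_2 MAKE_CONSTANT (MAGIC_NUMBER - 84, MORE_MAGIC_1 & MORE_MAGIC_2 ^ 0xA)
#define SKIP_CONSTANT "What?"

// Arguments are substituted textually, so CONSTANT_2 expands to
// (40 + 0x2 - 84 | (345 & 65 ^ 0xA << 4)); << binds tighter than &, ^ and |.
static_assert((CONSTANT_2) == -9, "CONSTANT_2 is a compile-time integer constant");

// Hypothetical helper: selected when the macro's value has an integral type.
// Unary + promotes char and bool to int so that they print as numbers.
template <typename T>
typename std::enable_if<std::is_integral<T>::value>::type
printIfIntegral(std::ostream& out, const char* name, T value) {
    assert(name != nullptr);
    out << name << " " << +value << "\n";
}

// Fallback for every other type (string literals, floating point, ...): print nothing.
template <typename T>
typename std::enable_if<!std::is_integral<T>::value>::type
printIfIntegral(std::ostream&, const char*, T) {}

// What the body of the generated main() could look like, collected into a string.
std::string collectIntegralDefines() {
    std::ostringstream out;
    printIfIntegral(out, "MAGIC_NUMBER", (MAGIC_NUMBER));
    printIfIntegral(out, "CONSTANT_2", (CONSTANT_2));
    printIfIntegral(out, "SKIP_CONSTANT", (SKIP_CONSTANT));
    return out.str();
}
```

In the generator, this would mean emitting the two helper templates once and writing one printIfIntegral call per macro in place of the std::cout line. The helper checks only the type of the value, not whether it is a compile-time constant, and a macro whose body is not an expression at all would still stop the generated file from compiling.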