
How to generate a C++ header with Apache Avro (python script)


I am interested in generating a C++ header using Apache Avro's code generation tool (i.e. the Python script). According to the documentation it should be fairly easy to do, but I don't usually use Python, so things look kinda strange to me.

The instructions state:

To generate the code is a two step process:

precompile < imaginary > imaginary.flat

The precompile step converts the schema into an intermediate format that is used by the code generator. This intermediate file is just a text-based representation of the schema, flattened by a depth-first-traverse of the tree structure of the schema types.

python scripts/gen-cppcode.py --input=example.flat --output=example.hh --namespace=Math

This tells the code generator to read your flattened schema as its input, and generate a C++ header file in example.hh. The optional argument namespace will put the objects in that namespace...

My Issue (no, I can't see a doctor or use a cream for it):

I don't see anything that explains in detail how to precompile. The documentation makes it seem as if I could just type "precompile" at the command prompt, supply the command-line arguments, and things would magically work, but precompile is not a valid Windows command. So what's the proper way to precompile on Windows? If anybody knows how to do it, then PLEASE let me know!

I also tried to run the gen-cppcode.py script, but it fails with a syntax error at line 316 (which, I suspect, may be happening because I didn't precompile the schema):

def doEnum(args):
    structDef = enumTemplate;
    typename = args[1]
    structDef = structDef.replace('$name$', typename)
    end = False
    symbols = '';
    firstsymbol = '';
    while not end:
        line = getNextLine()
        if line[0] == 'end': end = True
        elif line[0] == 'name':
            if symbols== '' :
                firstsymbol = line[1]
            else :
                symbols += ', '
            symbols += line[1]
        else: print "error"  # <-- Syntax Error: invalid syntax
    structDef = structDef.replace('$enumsymbols$', symbols);
    structDef = structDef.replace('$firstsymbol$', firstsymbol);
    addStruct(typename, structDef)
    return (typename,typename)

Solution

  • The only way I found to do this is:

    1. Download VirtualBox.
    2. Install Ubuntu (or another distro).
    3. Download Avro.
    4. Install cmake.
    5. Install the C++ compilers (the build-essential package).
    6. Install Boost, flex, and bison (e.g. sudo apt-get install libboost-all-dev flex bison). You will specifically need these Boost libraries:
      -- regex
      -- filesystem
      -- system
      -- program_options
    7. Build Avro:

      $ tar xf avro-cpp-1.5.1.tar.gz
      $ cd avro-cpp-1.5.1
      $ cmake -G "Unix Makefiles"
      $ make -j3
      $ build/precompile file.input file.flatoutput

    You can now generate a header file (still in the terminal window of the VM):

    python scripts/gen-cppcode.py --input=example.flat --output=example.hh
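
If you end up regenerating headers often, the two commands above can be wrapped in a small script. The following is only a sketch: it assumes you run it from the avro-cpp-1.5.1 directory, that build/precompile accepts input and output file names as in the build step above, and that `python` on your PATH is an interpreter the generator script supports. The function names are made up for illustration.

```python
import subprocess

# Locations taken from the commands shown above; adjust them for your checkout.
PRECOMPILE = "build/precompile"
GENERATOR = "scripts/gen-cppcode.py"


def build_commands(schema, flat, header, namespace=None):
    """Return the two command lines (as argument lists) for the two-step process."""
    precompile_cmd = [PRECOMPILE, schema, flat]
    generate_cmd = ["python", GENERATOR, "--input=" + flat, "--output=" + header]
    if namespace:
        # --namespace is optional, as described in the quoted instructions.
        generate_cmd.append("--namespace=" + namespace)
    return [precompile_cmd, generate_cmd]


def generate_header(schema, flat, header, namespace=None):
    """Run both steps in order; check_call raises CalledProcessError on failure."""
    for cmd in build_commands(schema, flat, header, namespace):
        subprocess.check_call(cmd)
```

For example, generate_header("example.json", "example.flat", "example.hh", namespace="Math") would run the precompile step and then the generator, where example.json is a placeholder for your own schema file.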

    Note that even after you generate the C++ header, you will still be unable to build with it on Windows, even if your include paths point at avro-cpp-1.5.1/api. Avro depends on Unix headers (such as sys/uio.h), and I'm not sure how to resolve those dependencies yet.
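
As for the syntax error quoted in the question: it is probably unrelated to precompiling. The flagged line uses the Python 2 print statement (print "error"), which a Python 3 interpreter rejects as invalid syntax, so my guess is that the script was run under Python 3. Here is a quick way to check how your interpreter treats the two forms (a minimal sketch; the helper name is made up):

```python
import sys


def parses_on_this_interpreter(source):
    """Return True if `source` compiles on the interpreter running this script."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# Statement form, as used in the quoted gen-cppcode.py line.
py2_style = 'print "error"'
# Function-call form, accepted by both Python 2 and Python 3.
py3_style = 'print("error")'

print("Python major version:", sys.version_info[0])
print("statement form parses:", parses_on_this_interpreter(py2_style))
print("call form parses:", parses_on_this_interpreter(py3_style))
```

If the statement form does not parse on your machine, running gen-cppcode.py with a Python 2 interpreter should get past that line.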