How is it possible to compile code from code?


I want to experiment with programs that write programs in C, and I want to use a construction like the following:

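#include <stdio.h>

/* compile() is the hypothetical function this question asks about */
int (*compile(const char *src))(int);

int main() {
    const char *srcCode = "int f(int x) { return x+42; }";
    int (*compiledFun)(int) = compile(srcCode);
    printf("result=%d", (*compiledFun)(123));
    return 0;
}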

The desired output is "result=165".

My question is about the compile() function. I could write srcCode to a file, invoke an external compiler such as gcc, read the binary it produces, probably fix up some addresses, and copy the result into memory for compiledFun. But that feels like a very inefficient workaround. Is there a way to compile a program from within a program, directly from memory to memory? Perhaps a library, or a subset that could be extracted from the gcc sources, that is responsible for producing binary code from source text?


One possibly important addition: every piece of source code to be compiled is a single function that takes arguments and returns a value. It will not call any external libraries or functions such as printf; it only performs some calculations and returns.
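
For comparison, the file-based route described above can be sketched roughly as follows. This is only an illustration under several assumptions: a POSIX system, a compiler reachable as `cc`, and fixed temporary paths. The names `compile_via_cc` and `int_fn` are hypothetical. Building a shared object and loading it with dlopen() lets the dynamic loader do the address fix-ups.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef int (*int_fn)(int);

/* Hypothetical helper: writes src to a file, builds a shared object with the
   system compiler, loads it, and returns the function called `name`.
   Returns NULL on any failure. */
int_fn compile_via_cc(const char *src, const char *name)
{
    /* fixed paths keep the sketch short; real code should use mkstemp() */
    const char *c_path = "/tmp/jit_demo.c";
    const char *so_path = "/tmp/jit_demo.so";
    char cmd[256];
    FILE *fp;
    void *lib;

    fp = fopen(c_path, "w");
    if (!fp)
        return NULL;
    fputs(src, fp);
    if (fclose(fp) != 0)
        return NULL;

    /* assumes `cc` is on PATH and accepts gcc-style options */
    snprintf(cmd, sizeof cmd, "cc -shared -fPIC -o %s %s", so_path, c_path);
    if (system(cmd) != 0)
        return NULL;

    /* the handle is deliberately never dlclose()d, so the code stays mapped */
    lib = dlopen(so_path, RTLD_NOW);
    if (!lib)
        return NULL;

    return (int_fn)dlsym(lib, name);
}
```

With such a helper, the main() above would call compile_via_cc(srcCode, "f") instead of compile(srcCode). On older glibc versions the program may need to be linked with -ldl. Each call spawns a compiler process and touches the file system, which is the inefficiency the question is worried about.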


Solution

  • Use libtcc, the in-memory C compiler library from TinyCC (the Tiny C Compiler).

    A complete example, taken from https://github.com/TinyCC/tinycc/blob/mob/tests/libtcc_test.c:

    /*
     * Simple Test program for libtcc
     *
     * libtcc can be useful to use tcc as a "backend" for a code generator.
     */
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    #include "libtcc.h"
    
    /* this function is called by the generated code */
    int add(int a, int b)
    {
        return a + b;
    }
    
    /* this string is referenced by the generated code */
    const char hello[] = "Hello World!";
    
    char my_program[] =
    "#include <tcclib.h>\n" /* include the "Simple libc header for TCC" */
    "extern int add(int a, int b);\n"
    "#ifdef _WIN32\n" /* dynamically linked data needs 'dllimport' */
    " __attribute__((dllimport))\n"
    "#endif\n"
    "extern const char hello[];\n"
    "int fib(int n)\n"
    "{\n"
    "    if (n <= 2)\n"
    "        return 1;\n"
    "    else\n"
    "        return fib(n-1) + fib(n-2);\n"
    "}\n"
    "\n"
    "int foo(int n)\n"
    "{\n"
    "    printf(\"%s\\n\", hello);\n"
    "    printf(\"fib(%d) = %d\\n\", n, fib(n));\n"
    "    printf(\"add(%d, %d) = %d\\n\", n, 2 * n, add(n, 2 * n));\n"
    "    return 0;\n"
    "}\n";
    
    int main(int argc, char **argv)
    {
        TCCState *s;
        int i;
        int (*func)(int);
    
        s = tcc_new();
        if (!s) {
            fprintf(stderr, "Could not create tcc state\n");
            exit(1);
        }
    
        /* if tcclib.h and libtcc1.a are not installed, where can we find them */
        for (i = 1; i < argc; ++i) {
            char *a = argv[i];
            if (a[0] == '-') {
                if (a[1] == 'B')
                    tcc_set_lib_path(s, a+2);
                else if (a[1] == 'I')
                    tcc_add_include_path(s, a+2);
                else if (a[1] == 'L')
                    tcc_add_library_path(s, a+2);
            }
        }
    
        /* MUST BE CALLED before any compilation */
        tcc_set_output_type(s, TCC_OUTPUT_MEMORY);
    
        if (tcc_compile_string(s, my_program) == -1)
            return 1;
    
        /* as a test, we add symbols that the compiled program can use.
           You may also open a dll with tcc_add_dll() and use symbols from that */
        tcc_add_symbol(s, "add", add);
        tcc_add_symbol(s, "hello", hello);
    
        /* relocate the code */
        if (tcc_relocate(s, TCC_RELOCATE_AUTO) < 0)
            return 1;
    
        /* get entry symbol */
        func = tcc_get_symbol(s, "foo");
        if (!func)
            return 1;
    
        /* run the code */
        func(32);
    
        /* delete the state */
        tcc_delete(s);
    
        return 0;
    }