
Why does DynamoRIO's instruction trace sample produce more output than mine?


I am trying to trace instructions with the instrumentation tool DynamoRIO. Their site already has an instruction-tracing sample, instrace_x86.c, but I don't understand why its instrument_instr function performs so many operations. I tried rewriting the function in a simpler way:

static void
instrument_instr(void *drcontext, instrlist_t *ilist, instr_t *where)
{
    app_pc pc;
    per_thread_t *data;

    data = drmgr_get_tls_field(drcontext, tls_index);
    pc = instr_get_app_pc(where);

    fprintf(data->logf, PIFX",%s\n",
        (ptr_uint_t)pc, decode_opcode_name(instr_get_opcode(where)));
}

This simpler version also seems to work, except that it produces less output than the official sample.

I don't know why my approach logs less, because I don't understand why the official sample performs all those extra operations. Is anyone familiar with DynamoRIO's API, especially the drmgr_register_bb_instrumentation_event function? I don't understand why the callback function is used the way it is.


Solution

  • The function instrument_instr is called when DynamoRIO transforms a basic block, not when the basic block is executed. Since basic blocks are often transformed only once but executed many times, your version logs each instruction once per transformation, whereas the sample tool records it every time it executes. That is why your output is shorter.

    An oversimplified view of DynamoRIO's internal workings is this: DynamoRIO transforms the basic blocks of the target application before they are executed. This lets the user (you) make arbitrary changes and lets DynamoRIO take back control after a basic block has been executed. The transformed blocks are written to the so-called code cache, where they are executed. DynamoRIO rewrites addresses carefully so that jumps and offsets still work. The purpose of the code cache is speed: basic blocks that have already been transformed need not be transformed again, so they stay in the code cache, and jumps from other places in the program to the original basic blocks are automatically replaced with jumps to the transformed basic blocks in the code cache.

    To display a complete execution trace, you have to change the basic blocks so that they themselves output every instruction they contain; you then get that output automatically each time a basic block is executed. Doing this efficiently is not trivial, which is why the sample tool contains a lot of code.

    I recommend reading some of the tutorial slide sets, for example the DynamoRIO tutorial at CGO Feb 2017 (PDF).
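The difference between logging at transformation time and logging at execution time can be sketched with a toy model in plain C. This is not DynamoRIO code: toy_block_t, trace_counts_t, on_translate, on_execute, run_block and simulate are hypothetical names invented for illustration, and the "code cache" is reduced to a single flag per block. The sketch only counts how many log lines each approach would write.

```c
#include <assert.h>

/* Toy stand-in for a basic block (hypothetical, not a DynamoRIO type). */
typedef struct {
    const char *name;
    int num_instrs;
    int in_code_cache; /* nonzero once the block has been transformed */
} toy_block_t;

typedef struct {
    int translate_time_lines; /* lines logged from the transformation callback */
    int exec_time_lines;      /* lines logged by code placed inside the block */
} trace_counts_t;

/* Runs once per block, when it is first transformed: one line per instruction.
 * This corresponds to the fprintf in the question's rewrite. */
static void on_translate(toy_block_t *b, trace_counts_t *c)
{
    c->translate_time_lines += b->num_instrs;
    b->in_code_cache = 1;
}

/* Runs on every execution of the transformed block: one line per instruction.
 * This corresponds to instrumentation inserted into the block itself. */
static void on_execute(const toy_block_t *b, trace_counts_t *c)
{
    c->exec_time_lines += b->num_instrs;
}

static void run_block(toy_block_t *b, trace_counts_t *c)
{
    if (!b->in_code_cache)
        on_translate(b, c); /* transform only on first use */
    on_execute(b, c);
}

/* Runs a 3-instruction loop body `iterations` times, then a 2-instruction
 * exit block once, and returns both line counts. */
trace_counts_t simulate(int iterations)
{
    toy_block_t loop_body = { "loop_body", 3, 0 };
    toy_block_t exit_block = { "exit", 2, 0 };
    trace_counts_t counts = { 0, 0 };

    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++)
        run_block(&loop_body, &counts);
    run_block(&exit_block, &counts);
    return counts;
}
```

Calling simulate(100) yields 5 transformation-time lines (3 + 2, each block seen once) but 302 execution-time lines (3 × 100 + 2), which mirrors the gap between the rewrite in the question and the sample tool. In a real client, the execution-time part has to be emitted into the block as instrumentation, for example via a clean call inserted with dr_insert_clean_call, which is simple but slow; the sample presumably does more work precisely to avoid that cost.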