c++ c x86 profiling intel-pin

Add your own instructions using Pin


Is it possible to add your own code to the code generated by Intel Pin?

I have been wondering about this for a while. I created a simple tool:

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include "pin.H"

// Additional library calls go here

/*********************/

// Output file object
std::ofstream OutFile;

//static uint64_t counter = 0;

uint32_t lock = 0;
uint32_t unlock = 1;
std::string rtin = "";
// Make this lock if you want to print from _start
uint32_t key = unlock;

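// Analysis routine: prints the address and disassembly of an instruction.
// The disassembly arrives as a pointer because it is passed with IARG_PTR.
void printmaindisas(ADDRINT addr, const std::string *disassins)
{
    // A non-zero key means tracing is disabled (we are outside main)
    if (key)
        return;
    // Skip code mapped at high addresses (typically shared libraries)
    if (addr > 0x700000000000)
        return;
    std::stringstream tempstream;
    tempstream << std::hex << addr;
    std::cout << tempstream.str() << "\t" << *disassins << std::endl;
}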

// Called when main returns: disable tracing
void mutex_lock()
{
    key = !lock;
    std::cout<<"out\n";
}

// Called on entry to main: enable tracing
void mutex_unlock()
{
    key = lock;
    std::cout<<"in\n";
}

void Instruction(INS ins, VOID *v)
{
    // Insert a call to printmaindisas before every instruction, passing the
    // instruction's address and a heap-allocated copy of its disassembly
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printmaindisas,
                   IARG_ADDRINT, INS_Address(ins),
                   IARG_PTR, new std::string(INS_Disassemble(ins)),
                   IARG_END);
}

void Routine(RTN rtn, VOID *V)
{
    if (RTN_Name(rtn) == "main")
    {
        //std::cout<<"Loading: "<<RTN_Name(rtn) << endl;
        RTN_Open(rtn);
        RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)mutex_unlock, IARG_END);
        RTN_InsertCall(rtn, IPOINT_AFTER, (AFUNPTR)mutex_lock, IARG_END);
        RTN_Close(rtn);
    }
}

KNOB<std::string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", "o", "mytool.out", "specify output file name");
/*
VOID Fini(INT32 code, VOID *v)
{
    // Write to a file since cout and cerr maybe closed by the application
    OutFile.setf(ios::showbase);
    OutFile << "Count " << count << endl;
    OutFile.close();
}
*/

int32_t Usage()
{
  std::cerr << "This is my custom tool" << std::endl;
  std::cerr << std::endl << KNOB_BASE::StringKnobSummary() << std::endl;
  return -1;
}

int main(int argc, char * argv[])
{
  // It must be called for image instrumentation
  // Initialize the symbol table
  PIN_InitSymbols();

  // Initialize pin
  if (PIN_Init(argc, argv)) return Usage();
  // Open the output file to write
  OutFile.open(KnobOutputFile.Value().c_str());

  // Set the disassembly syntax to Intel
  // Not needed here: Pin already prints the disassembly in Intel syntax
  //PIN_SetSyntaxIntel();

  RTN_AddInstrumentFunction(Routine, 0);
  //IMG_AddInstrumentFunction(Image, 0);

  // Register the instruction-level instrumentation callback
  INS_AddInstrumentFunction(Instruction, 0);

  //PIN_AddFiniFunction(Fini, 0);

  // Start the program here
  PIN_StartProgram();

  return 0;

}

If I run the tool on the following C program (which does literally nothing):

int main(void)
{}

I get this output:

in
400496  push rbp
400497  mov rbp, rsp
40049a  mov eax, 0x0
40049f  pop rbp
out

And with the following code:

#include <stdio.h>
int main(void)
{
  printf("%s\n", "Hello");
}

the tool prints:

in
4004e6  push rbp
4004e7  mov rbp, rsp
4004ea  mov edi, 0x400580
4004ef  call 0x4003f0
4003f0  jmp qword ptr [rip+0x200c22]
4003f6  push 0x0
4003fb  jmp 0x4003e0
4003e0  push qword ptr [rip+0x200c22]
4003e6  jmp qword ptr [rip+0x200c24]
Hello
4004f4  mov eax, 0x0
4004f9  pop rbp
out

So, my question is, is it possible to add:

4004ea  mov edi, 0x400580
4004ef  call 0x4003f0
4003f0  jmp qword ptr [rip+0x200c22]
4003f6  push 0x0
4003fb  jmp 0x4003e0
4003e0  push qword ptr [rip+0x200c22]
4003e6  jmp qword ptr [rip+0x200c24]

instructions to my first program (the one without the printf call), using Pin in an instrumentation routine or an analysis routine, so that I can imitate my second program by adding those instructions dynamically? I don't want to call printf directly; I want to imitate its behavior. In the future I was thinking of imitating a sanity checker or Intel MPX using Pin, if I could add such check instructions dynamically in some way.

I looked at the Pin documentation. It has an instruction modification API, but that can only be used to add direct or indirect branches or to delete instructions; it cannot add new ones.


Solution

  • An analysis routine (or replacement routine) is really just code inserted into the application being profiled. But it appears to me that you want to modify one or more registers of the application context. By default, when an analysis routine executes, the Pin runtime saves the application context on entrance to the analysis routine and then later restores it when the routine returns. This basically allows the analysis routine to execute without any unintended changes to the application. However, Pin provides three ways to modify the application context in an analysis or replacement routine:

    • Pass the IARG_RETURN_REGS argument to the routine. The value returned from the routine is stored into the specified register of the application context. This enables you to change any single register whose size does not exceed the size of ADDRINT, which is the return value type of the routine. This is not supported in Probe mode or with the Buffering API (1). However, it is the most efficient way to change a single register.
    • Pass an IARG_REG_REFERENCE argument for each register you want to modify in the routine. For each such argument, you need to add a parameter in the declaration of the routine of type PIN_REGISTER*. This is not supported in Probe mode or with the Buffering API, but it is the most efficient way to change a couple of registers and supports all registers.
    • Pass the IARG_CONTEXT argument to the routine. You need to add a parameter in the declaration of the routine of type CONTEXT*. Use the context manipulation API to change one or more registers of the application context. For example, you can change the RIP register of the application context using PIN_SetContextReg(ctxt, REG_INST_PTR, NewRipValue). In order for the context changes to take effect, PIN_ExecuteAt must be called, which resumes the execution of the application at the potentially changed RIP with the specified context. This is not supported with the Buffering API and there are restrictions in the Probe mode.

    For example, if you want to execute mov edi, 0x400580 in the application context, you can simply store the value 0x400580 in the EDI register of the application context in your analysis routine:

    r->dword[0] = 0x400580;
    r->dword[1] = 0x0;      // See: https://stackoverflow.com/questions/11177137/why-do-x86-64-instructions-on-32-bit-registers-zero-the-upper-part-of-the-full-6
    

    where r is of type PIN_REGISTER*. Or alternatively:

    PIN_SetContextReg(ctxt, REG_EDI, 0x400580); // https://stackoverflow.com/questions/38782709/what-is-the-default-type-of-integral-literals-represented-in-hex-or-octal-in-c
    

    Later when application execution resumes, RDI will contain 0x400580.

    Note that you can change any valid memory location in your analysis routine whether it belongs to the application or your Pin tool. For example, if the RAX register of the application context contains a pointer, you can directly access the memory location at that pointer just like any other pointer.


    Footnotes:

    (1) It seems you're not using the Probe mode or the Buffering API.