x86, reverse-engineering, disassembly, decompiling

Given an instruction address, can the starting address of the function enclosing it be determined?


I've run into this problem in my current project, which requires reasoning about code at the binary level.

I think we can determine the starting location of all functions in a program by looking at the operands of CALL instructions. Once we have this list, can we determine which function encloses an address by simply searching backward until we find a start address? That is, is the start address of the function enclosing an instruction the greatest function start address that is less than or equal to the instruction address?
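To make the proposed heuristic concrete, here is a minimal sketch of the lookup step. It assumes the CALL targets have already been collected into a list of integers; the function name `enclosing_function_start` and the sample addresses are hypothetical. It only implements "greatest known start address not above the instruction address", so it is correct only if every function is contiguous and every function start was found.

```python
from bisect import bisect_right

def enclosing_function_start(call_targets, address):
    """Return the greatest known function start <= address, or None.

    call_targets: function start addresses harvested from CALL operands
    (assumed to be collected elsewhere).
    """
    starts = sorted(set(call_targets))
    # bisect_right gives the index of the first start strictly above address,
    # so the entry just before it is the candidate enclosing function.
    i = bisect_right(starts, address)
    return starts[i - 1] if i > 0 else None

# Hypothetical addresses, for illustration only.
targets = [0x4010A0, 0x401000, 0x401040]
print(hex(enclosing_function_start(targets, 0x401052)))
```

An address below every known start yields `None`. Functions reached only through indirect calls or jumps never appear in `call_targets`, so their instructions would be attributed to whichever function happens to precede them.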

If the above method is not correct, is there another way to find the starting address of the function enclosing an instruction?

edit: Added clarification of the question.

edit2: My method is probably wrong. Compilers are not guaranteed to place function bodies in contiguous regions of machine code.


Solution

  • You need to constrain your problem space more. Even when constrained just to "the output of a compiled language", compilers nowadays are good at blurring the boundaries between functions. Inlining means one function can be enclosed within another. Tail-call optimization transfers control between two functions without a CALL instruction. Profile-guided optimization can create discontiguous functions. Code flow analysis and noreturn hints can result in code falling through to data. Jump tables mean that data can fall through to code without a CALL target. The only reliable way is to have the compiler explicitly tell you the instruction-to-function mapping, say via debug information. You didn't say what platform you're using, so it's hard to give more specific information.
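As a rough sketch of what an explicit instruction-to-function mapping could look like once it has been extracted from debug information: each function owns one or more address ranges, and the lookup checks the range's end as well as its start. The range table, the names `build_index` and `function_at`, and all addresses below are hypothetical; how the ranges are obtained depends on the platform and debug format and is not shown.

```python
from bisect import bisect_right

def build_index(ranges):
    """ranges: iterable of (low, high, name); half-open and non-overlapping."""
    ordered = sorted(ranges)
    lows = [low for low, _, _ in ordered]
    return lows, ordered

def function_at(index, address):
    """Return the name of the function owning address, or None."""
    lows, ordered = index
    i = bisect_right(lows, address) - 1
    if i < 0:
        return None
    low, high, name = ordered[i]
    # Checking the upper bound is what separates this from the
    # "greatest start address" heuristic: gaps map to no function.
    return name if low <= address < high else None

# Hypothetical table: foo is split into two discontiguous pieces.
index = build_index([
    (0x1000, 0x1040, "foo"),
    (0x1040, 0x1080, "bar"),
    (0x2000, 0x2010, "foo"),  # second piece of foo, placed elsewhere
])
print(function_at(index, 0x2004))
print(function_at(index, 0x1090))
```

Because a function may own several ranges, this handles discontiguous functions, and addresses in padding or data between ranges map to no function instead of being attributed to the preceding one.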