reverse-engineering, ghidra

ghidra full of thunk functions


I am trying to solve a crackme in Ghidra. I already found the answer, but only by using a debugger and looking at strings, so I want to know how I would have done this "properly". As you can see in the image below, there are a lot of thunk functions, but honestly, to me they just look like calls to printf. I don't know how to fix this so that I get readable function names, or whether there is a way at all.

I assumed it might have something to do with an error about the PDB file that I get when I try to analyse the file. I tried to recompile msdia140.dll because I am using Visual Studio 2019, but I just got build errors.

TL;DR: how do I make the listing in the image below readable instead of full of thunk functions, given that those look like printf calls?

[Image: Ghidra Thunks]

What happens when I try to compile new PDA


Solution

  • "I assumed it may have had something to do with an error I get when I try to analyse the file which I get an error about the PDB file"

    I am assuming the error message is "Unable to locate PDB file "[...]" with matching GUID [...]". If so, the cause is that you don't have the PDB file, the file containing the debugging information, for the program you are reverse engineering. That is to be expected for a crackme, I'd say.

    "TL;DR how do I make it so that the image below isn't full of thunk functions and is actually readable in a way as those look like printf functions."

    Well, this is the actual work you need to do as a reverse engineer: work out what those functions do, then set names (and data types) accordingly.

    Apart from cases like MSIL or Java, where a lot of metadata is present in the compiled binary, Ghidra has no way to determine the names of a program's own functions automatically when no debug information is available. The situation is even worse for variables: in many architectures, machine code does not even have a concept of a variable.

    Ghidra has some heuristics to help you with the manual process. When it prefixes a name with thunk_, for example, it has assessed that the function in question simply passes control to another destination function. Another example: the call to the API function system was already named correctly.
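To make the thunk_ prefix concrete, here is a minimal Python sketch of what resolving such a stub amounts to. It assumes a 32-bit x86 binary in which the thunk is a single 5-byte `jmp rel32` instruction (opcode E9); that is one common shape for these stubs, not the only one, and Ghidra's own detection is more general. The stub address 0x00D81200 and its bytes are made up for illustration; only the destination 0x00D83950 is taken from the name thunk_FUN_00d83950.

```python
import struct


def resolve_jmp_thunk(address, code):
    """Return the destination of a 5-byte x86 `jmp rel32` stub, or None.

    address: virtual address of the first byte of the stub
    code:    raw bytes found at that address
    """
    # 0xE9 is the opcode for a near jump with a 32-bit relative displacement.
    if len(code) < 5 or code[0] != 0xE9:
        return None
    # The displacement is a signed little-endian 32-bit integer.
    (displacement,) = struct.unpack_from("<i", code, 1)
    # It is relative to the address of the *next* instruction (address + 5).
    return (address + 5 + displacement) & 0xFFFFFFFF


# Hypothetical stub location and bytes, chosen so the jump lands on 0x00D83950.
stub_address = 0x00D81200
stub_bytes = bytes.fromhex("e94b270000")

print(hex(resolve_jmp_thunk(stub_address, stub_bytes)))  # 0xd83950
```

A function whose whole body is such a jump does no work of its own, which is why renaming the destination (here FUN_00d83950) is what actually makes the listing readable.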

    The good news is that you have already started some of the work: based on the function arguments or on your dynamic analysis, you already guessed that thunk_FUN_00d83950 may be printf. So right-click the function you want to rename and click "Rename Function". Also take note of the hotkey listed in the menu; you will need it a lot.

    Other functions will need additional analysis work: double-click them and try to figure out what they do. Or combine your static reverse engineering with dynamic analysis, as you did before; that is a very powerful technique.

    After correcting some function names, you might also want to change their types. Right-click the function again, select "Edit Function Signature", and make adjustments in the window that appears. In the case of printf, which takes variadic arguments, it may be necessary to select "Varargs" on the right.
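The reason "Varargs" matters for printf is that the number of arguments differs from call site to call site and is dictated by the format string. As a rough aid for checking whether a suspected printf call is consistent with its format string, the sketch below counts how many variadic arguments a format string consumes. The helper name count_printf_args is hypothetical, and the pattern covers only the standard C conversions, not compiler-specific extensions.

```python
import re

# %[flags][width][.precision][length]conversion, standard C conversions only.
_SPEC = re.compile(
    r"%[-+ #0]*(\*|\d+)?(?:\.(\*|\d*))?(?:hh|h|ll|l|j|z|t|L)?"
    r"([diouxXeEfFgGaAcspn%])"
)


def count_printf_args(fmt):
    """Count the variadic arguments a printf-style format string consumes."""
    count = 0
    for match in _SPEC.finditer(fmt):
        width, precision, conversion = match.groups()
        if conversion == "%":
            continue  # "%%" prints a literal percent sign and takes no argument
        count += 1  # the converted value itself
        if width == "*":
            count += 1  # width is passed as an extra int argument
        if precision == "*":
            count += 1  # precision is passed as an extra int argument
    return count


print(count_printf_args("%d %s\n"))    # 2
print(count_printf_args("%.*s"))       # 2
print(count_printf_args("100%% done")) # 0
```

If the decompiler shows a call passing a format string plus a matching number of further arguments, that supports the printf guess; a fixed-arity signature would instead hide or misplace those extra arguments.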