
Obfuscation of checksum guards


As part of my project, I have to insert small pieces of code, called checksum guards, into a C program. Each guard calculates a checksum over a region of code using some function (add, xor, etc.) that operates on the instruction opcodes. If somebody tampers with the instructions in that region (adds, modifies, or deletes them), the checksum value changes and the intrusion is detected.

Here is the research paper which talks about this technique:

https://www.cerias.purdue.edu/assets/pdf/bibtex_archive/2001-49.pdf

Here is the guard template:

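guard:
      add ebp, -checksum      ; start by subtracting the expected checksum
      mov eax, client_addr    ; eax = address of the first guarded dword

for:
      cmp eax, client_end     ; past the last guarded address?
      jg end                  ; yes: leave the loop (note: jg compares as signed)
      mov ebx, dword [eax]    ; load the next 4 bytes of guarded code
      add ebp, ebx            ; accumulate them into ebp
      add eax, 4              ; advance to the next dword
      jmp for
end:                          ; ebp is unchanged only if the sum equals checksum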

Now, I have two questions.

  1. Would putting the guards in the assembly be better than putting them in the source program?

  2. Assuming I put them in the assembly (at an appropriate place), what kind of obfuscation should I use so that the guard template is not easily recognizable? With more than one guard, the attacker should not be able to find all the guard templates easily and disable them all at once, as that would leave the code with no protection.

Thank you in advance.


Solution

  • From the point of view of an attacker without the sources, the first question doesn't matter: he is tampering with the final binary machine code, and whether it was produced from .c or .s makes zero difference. So I would worry mainly about how to generate the correct binary with the appropriate checksums. I'm not aware of any way to get a proper checksum inside the C source, but I can imagine an external tool running over the assembler files created by the C compiler as a post-processing step, before the .s files are assembled into .o files. But keep in mind that some call targets and addresses are not final at that stage: the linker fills them in, and the OS loader may patch absolute addresses again, according to the relocation table, when it loads the binary into memory. Thus the operand (data) bytes will change, while the opcodes stay fixed.

    Your guard template doesn't take that into account: it checksums the whole instructions, data bytes included. (Some advanced guards carry opcode definitions and checksum/encrypt/decrypt only the opcodes themselves, without the operand bytes.)
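One way to picture a checksum that tolerates patched operand bytes is sketched below. This is an assumption-laden illustration, not how any particular tool does it: `reloc_offsets` is a hypothetical list of byte offsets where 4-byte address fields start, and the function names are invented for the example. A real tool would have to obtain those offsets from the object file's relocation information.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if byte offset `pos` lies inside one of
   the 4-byte fields listed in `reloc_offsets` (fields that get patched). */
static int is_relocated(size_t pos, const size_t *reloc_offsets,
                        size_t reloc_count)
{
    for (size_t i = 0; i < reloc_count; i++)
        if (pos >= reloc_offsets[i] && pos < reloc_offsets[i] + 4)
            return 1;
    return 0;
}

/* Byte-wise additive checksum over code[0..len) that ignores every byte
   belonging to a relocated field, so patching does not change the result. */
uint32_t checksum_skip_relocs(const uint8_t *code, size_t len,
                              const size_t *reloc_offsets, size_t reloc_count)
{
    uint32_t sum = 0;
    for (size_t pos = 0; pos < len; pos++)
        if (!is_relocated(pos, reloc_offsets, reloc_count))
            sum += code[pos];
    return sum;
}
```

For the six bytes `A1 00 10 40 00 C3` with one relocated field at offset 1, only `A1` and `C3` contribute, so rewriting the address leaves the checksum alone while changing `C3` does not.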

    Otherwise it's neat that the result is a damaged ebp value, which ruins any surrounding C code (*) that works with stack variables. But it's still an artificial test: an attacker can simply patch out both add ebp,-checksum and add ebp,ebx, making the guard harmless.

    (*) Notice that you have to put the guard in between some classic C code to get real runtime problems from the invalid ebp value. If you put it at the end of a subroutine that ends with pop ebp, everything would still work.

    So to the second question:

    You definitely want nastier ways to enforce the correct value than ebp damage alone. Usually the hardest approach to remove is to make the checksum value part of some calculation, skewing the results only slightly, so that serious use of the software becomes impossible but the user takes a while to notice.
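A rough sketch of that idea in C follows. Everything in it is hypothetical: `price_with_tax` stands in for some genuine application routine, and `residue` is assumed to be the guard result (computed sum minus expected checksum), which is 0 for untampered code. The residue is mixed into the arithmetic instead of being tested with an explicit "tampered?" branch.

```c
#include <stdint.h>

/* Hypothetical application routine: price plus 20% tax, in cents.
   `residue` is assumed to be 0 when the guarded code is intact. */
int64_t price_with_tax(int64_t cents, uint32_t residue)
{
    /* Branch-free "is residue nonzero?": for any nonzero value, either
       residue or its two's-complement negation has the top bit set. */
    uint32_t nonzero = (residue | (uint32_t)(0u - residue)) >> 31;

    /* 20% when intact, 21% when tampered: wrong, but not obviously so. */
    int64_t rate_percent = 20 + (int64_t)nonzero;
    return cents + cents * rate_percent / 100;
}
```

With an intact region, `price_with_tax(1000, 0)` gives 1200; with any nonzero residue it gives 1210. A compiler is free to turn this back into a branch, so in practice the mixing would still need to be checked in the generated assembly.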

    You can also add your checksumming to some genuine code loop, so that simply skipping the whole loop also skips valid code (but I can only imagine this being added by hand to the assembly generated from C, so you would have to redo it after every new compilation of that C source).

    Then the particular guard template can be obfuscated by any imaginable mutation (different registers, a modified order of instructions, instruction variants). Try searching for viruses that use mutation encoding (polymorphic or metamorphic code) to get some ideas.
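As a small illustration of "same result, different shape", the two hypothetical C functions below compute the same guard residue in different ways: one walks forward and adds, the other walks backward and subtracts. Real mutation would be done at the instruction level (registers, instruction order, equivalent encodings), which C cannot express directly, so treat this only as a sketch of the equivalence that such mutations rely on.

```c
#include <stddef.h>
#include <stdint.h>

/* Variant A: start at -checksum, walk forward, add each word. */
uint32_t residue_forward(const uint32_t *w, size_t n, uint32_t checksum)
{
    uint32_t acc = (uint32_t)(0u - checksum);
    for (size_t i = 0; i < n; i++)
        acc += w[i];
    return acc;
}

/* Variant B: start at +checksum, walk backward, subtract each word,
   then negate. Addition modulo 2^32 is commutative, so the result
   matches variant A for every input. */
uint32_t residue_backward(const uint32_t *w, size_t n, uint32_t checksum)
{
    uint32_t acc = checksum;
    for (size_t i = n; i > 0; i--)
        acc -= w[i - 1];
    return (uint32_t)(0u - acc);
}
```

Both return 0 for an intact region and the same nonzero value for a tampered one, yet a pattern written to match one loop would not match the other.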

    I didn't read the paper, but from the figures I would say the main point is to make the guarded areas overlap, so that patching out one guard affects another. That sounds to me like the extra sugar that makes it somewhat functional (although it still looks like a normal challenge for 8-bit game crackers ;), not even "hard" level). But it also means you would need either a very cunning external tool to calculate that cyclic tree of dependencies and insert the guard templates in the correct order, or to do it all manually.

    Of course, when doing it manually, you have to redo it after each new C compilation, so it's worth the effort only for something very precious and expensive, or rock-solid stable, where you will not produce another revision for the next 10 years or so... :D
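The overlap idea can be simulated with a toy model in C. This is a simplification under stated assumptions: the "image" is just an array of words, the `struct guard` layout and function names are invented for the example, and the expected checksums are stored outside the image, which sidesteps exactly the ordering problem described above (in a real binary the stored checksums and the guard code would themselves sit inside guarded regions).

```c
#include <stddef.h>
#include <stdint.h>

struct guard {
    size_t start;      /* index of the first guarded word */
    size_t count;      /* number of guarded words */
    uint32_t expected; /* checksum recorded when the image was built */
};

static uint32_t range_sum(const uint32_t *image, size_t start, size_t count)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += image[start + i];
    return sum;
}

/* Record the expected checksum of every guard for the pristine image. */
void seal_guards(const uint32_t *image, struct guard *guards, size_t n)
{
    for (size_t g = 0; g < n; g++)
        guards[g].expected =
            range_sum(image, guards[g].start, guards[g].count);
}

/* Count how many guards see a checksum mismatch in the current image. */
size_t count_tripped(const uint32_t *image, const struct guard *guards,
                     size_t n)
{
    size_t tripped = 0;
    for (size_t g = 0; g < n; g++)
        if (range_sum(image, guards[g].start, guards[g].count)
                != guards[g].expected)
            tripped++;
    return tripped;
}
```

With two guards covering words 0..7 and 4..11 of a 12-word image, a change in the shared words 4..7 trips both guards, while a change outside the overlap trips only one.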