
How to bruteforce a lossy AND routine?


I'm wondering whether there are any standard approaches to reversing AND routines by brute force. For example, I have the following transformation:

MOV(eax, 0x5b3e0be0)  # Here we move 0x5b3e0be0 to EAX.
MOV(edx, eax)  # Here we copy 0x5b3e0be0 to EDX as well.
SHL(edx, 0x7)  # Shift 0x5b3e0be0 left by 0x7 bits, which results in 0x9f05f000
AND(edx, 0x9d2c5680)  # AND 0x9f05f000 with 0x9d2c5680, which results in 0x9d045000
XOR(edx, eax)  # XOR 0x9d045000 with the original value 0x5b3e0be0, which results in 0xc63a5be0

My question is how to brute force and reverse this routine (i.e. transform 0xc63a5be0 back into 0x5b3e0be0).

One idea I had (which didn't work) was this PeachPy implementation:

#Input values
MOV(esi, 0xffffffff)  # Initial value to AND with, which will be decreased by 1 in a loop.
MOV(cl, 0x1)  # Initial value to SHR with, which will be increased by 1 until 0x1f.
MOV(eax, 0xc63a5be0)  # Target result which I'm looking to get using the below loop.
MOV(edx, 0x5b3e0be0)  # Input value which will be transformed.

sub_esi = peachpy.x86_64.Label()
with loop:
    #End the loop if ESI = 0x0
    TEST(esi, esi)
    JZ(loop.end)
    #Test the routine and check if it matches end result.
    MOV(ebx, eax)
    SHR(ebx, cl)
    TEST(ebx, ebx)
    JZ(sub_esi)
    AND(ebx, esi)
    XOR(ebx, eax)
    CMP(ebx, edx)
    JZ(loop.end)
    #Add to the CL register which is used for SHR.
    #Also check if we've reached the last potential value of CL which is 0x1f
    ADD(cl, 0x1)
    CMP(cl, 0x1f)
    JNZ(loop.begin)

    #Decrement ESI by 1, reset CL and restart routine.
    peachpy.x86_64.LABEL(sub_esi)
    SUB(esi, 0x1)
    MOV(cl, 0x1)
    JMP(loop.begin)

#The ESI result here will either be 0x0 or a valid value to AND with and get the necessary result.
RETURN(esi)

Maybe an article or a book you can recommend specific to this?


Solution

  • It's not lossy, the final operation is an XOR.
    The whole routine can be modeled in C as

    #define K 0x9d2c5680
    uint32_t hash(uint32_t num)
    {
      return num ^ ( (num << 7) & K);
    }
    

    Now, if we have two bits x and y and the operation x XOR y, when y is zero the result is x.
    So given two numbers n1 and n2 and considering their XOR, the bits of n1 that pair with a zero in n2 make it to the result unchanged (the others are flipped).

    So in considering num ^ ( (num << 7) & K) we can identify num with n1 and (num << 7) & K with n2.
    Since n2 is an AND, we can tell that it must have at least the same zero bits that K has.
    This means that each bit of num that corresponds to a zero bit in the constant K will make it unchanged into the result.
    Thus, by extracting those bits from the result we already have a partial inverse function:

    /*hash & ~K extracts the bits of hash that pair with a zero bit in K*/
    partial_num = hash & ~K
    

    Technically, the factor num << 7 would also introduce other zeros in the result of the AND. We know for sure that the lowest 7 bits must be zero.
    However K already has the lowest 7 bits zero, so we cannot exploit this information.
    So we will just use K here, but if its value were different you'd need to consider the AND (which, in practice, means to zero the lower 7 bits of K).

    This leaves us with 13 bits unknown (the ones corresponding to the bits that are set in K). If we forget about the AND for a moment, we would have x ^ (x << 7) meaning that

    h_i = num_i                for i from 0 to 6 inclusive
    h_i = num_i ^ num_{i-7}    for i from 7 to 31 inclusive
    (The first line is due to the fact that the lower 7 bits of the right-hand operand, x << 7, are zero)

    From this, starting from h_7 and going up, we can retrieve num_7 as h_7 ^ num_0 = h_7 ^ h_0.
    From bit 14 onward, the shortcut num_{i-7} = h_{i-7} no longer holds and we need to use num_{i-7} itself, but luckily we have already computed its value in a previous step (that's why we go from lower to higher).

    What the AND does to this is just restricting the values the index i runs in, specifically only to the bits that are set in K.

    So to fill in the thirteen remaining bits one has to do:

    part_num_7 = h_7 ^ part_num_0
    part_num_9 = h_9 ^ part_num_2
    part_num_10 = h_10 ^ part_num_3
    part_num_12 = h_12 ^ part_num_5
    ...
    part_num_31 = h_31 ^ part_num_24

    Note that we exploited the fact that part_num_0..6 = h_0..6.

    Here's a C program that inverts the function:

    #include <stdio.h>
    #include <stdint.h>
    
    
    /* Bit i+7 of num = (bit i+7 of hash) XOR (bit i of result, already recovered). */
    #define BIT(i, hash, result) ( (( ((result) >> (i)) ^ ((hash) >> ((i)+7)) ) & 0x1) << ((i)+7) )
    #define K 0x9d2c5680
    
    uint32_t base_candidate(uint32_t hash)
    {
      uint32_t result = hash & ~K;
    
      result |= BIT(0, hash, result);
      result |= BIT(2, hash, result);
      result |= BIT(3, hash, result);
      result |= BIT(5, hash, result);
      result |= BIT(7, hash, result);
      result |= BIT(11, hash, result);
      result |= BIT(12, hash, result);
      result |= BIT(14, hash, result);
      result |= BIT(17, hash, result);
      result |= BIT(19, hash, result);
      result |= BIT(20, hash, result);
      result |= BIT(21, hash, result);
      result |= BIT(24, hash, result);
    
      return result;
    }
    
    uint32_t hash(uint32_t num)
    {
      return num ^ ( (num << 7) & K);
    }
    
    
    
    int main()
    {
    
      uint32_t tester = 0x5b3e0be0;
      uint32_t candidate = base_candidate(hash(tester));
    
      printf("candidate: %x, tester %x\n", candidate, tester);
      
      return 0;
    
    }