Tags: linux, assembly, nasm, reverse-engineering

Disassembling and Reassembling, how to properly pipeline this in the terminal?


I'm using the eicar.com file and playing around with reverse engineering tools. I'd like to be able to disassemble and reassemble this file. I get close but there are still a few problems that I cannot figure out.

This is the original eicar.com ASCII file:

X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*

Using udcli (udcli -noff -nohex eicar.com > stage1.asm), I end up with this x86 assembly:

pop eax                 
xor eax, 0x2550214f     
inc eax                 
inc ecx                 
push eax                
pop ebx                 
xor al, 0x5c            
push eax                
pop edx                 
pop eax                 
xor eax, 0x5e502834     
sub [edi], esi          
inc ebx                 
inc ebx                 
sub [edi], esi          
jge 0x40                
inc ebp                 
dec ecx                 
inc ebx                 
inc ecx                 
push edx                
sub eax, 0x4e415453     
inc esp                 
inc ecx                 
push edx                
inc esp                 
sub eax, 0x49544e41     
push esi                
dec ecx                 
push edx                
push ebp                
push ebx                
sub eax, 0x54534554     
sub eax, 0x454c4946     
and [eax+ecx*2], esp    
sub ecx, [eax+0x2a]

Finally, putting it back together with nasm (nasm stage1.asm -o stage2), I end up with:

fXf5O!P%f@fAfPf[4\fPfZfXf54(P^fg)7fCfCfg)7^O<8d>^R^@fEfIfCfAfRf-  STANfDfAfRfDf-ANTIfVfIfRfUfSf-TESTf-FILEfg!$Hfg+H*

In this case I'm starting with an ASCII file and end up with a bin file that holds a lot of extra garbage.

What am I missing here? How do I end up with the original ASCII string and have the proper file type?

EDIT: @Ross Ridge pointed out that I was disassembling a 16-bit file as if it were 32-bit code. Fixing that cleaned up the string, but the file type of the output is still reported as binary.

First fix: disassemble with udcli -16 -noff -nohex eicar.com > stage1.asm to obtain the proper output string.

Results in X5O!P%@AP[4\PZX54(P^)7CC)7^O<8d>"^@EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*

There is still a little garbage that is not present in the original, but it is very close.
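
One way to pin down the remaining garbage is to compare the two outputs byte by byte. The sketch below is only an illustration: it assumes that the ^O, <8d> and ^@ shown above are the usual terminal notation for the bytes 0x0F, 0x8D and 0x00, and the helper names first_difference and non_printable are made up for this example.

```python
# Bytes common to both files before and after the spot that differs.
prefix = b"X5O!P%@AP[4\\PZX54(P^)7CC)7"
tail = b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*"

# Original file, and the reassembled output from the 16-bit attempt.
# Assumption: ^O = 0x0F, <8d> = 0x8D, ^@ = 0x00.
original = prefix + b"}$" + tail
rebuilt = prefix + b"\x0f\x8d\"\x00" + tail


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) == len(b):
        return None
    return min(len(a), len(b))


def non_printable(data: bytes):
    """List (offset, byte) pairs that fall outside printable ASCII."""
    return [(i, byte) for i, byte in enumerate(data) if not 0x20 <= byte <= 0x7E]


print("first difference at offset", hex(first_difference(original, rebuilt)))
print("non-printable bytes in rebuilt file:",
      [(hex(i), hex(b)) for i, b in non_printable(rebuilt)])
```

Under those assumptions the files diverge at the two bytes of the jge instruction, and the rebuilt file contains three bytes outside the printable range, which is likely why it is no longer classified as plain ASCII text.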


Solution

  • In general, you can't reassemble the output of a disassembler back into exactly the same binary file as the original. There is often more than one way to assemble a given assembly instruction into machine code. It's also not very helpful for your ultimate goal of understanding the code you're trying to do this with. Even if you do get something that you can assemble back into the original code, it's extremely unlikely you'll get something you can modify and assemble into code that works.

    To illustrate this, I've provided my own "disassembly" of the eicar.com file, one that allows it to be modified to a limited extent. You can modify the string it prints, so long as the message isn't too long and doesn't contain any dollar sign ($) characters. You should be able to modify the string while still keeping the output consisting only of printable ASCII characters, assuming you only put printable ASCII characters in the string.

        BITS    16
        ORG     0x100
    
    ascii_shift EQU 0x097b
    
    start:
        pop     ax
        xor     ax, 0x2000 | (skip - start + 0x100) | 0x000f
        push    ax
        and     ax, 0x4000 | (skip - start + 0x100)
        push    ax
        pop     bx
        xor     al, (msg - start) ^ (skip - start)
        push    ax
        pop     dx
        pop     ax
        xor     ax, (0x2000 | (skip - start + 0x100) | 0x000f) ^ ascii_shift
        push    ax
        pop     si
        sub     [bx], si
        inc     bx
        inc     bx
        sub     [bx], si
        jnl     skip
    
    msg:
        DB      'EICAR-STANDARD-ANTIVIRUS-TEST-FILE!'
        DB      '$'
    
    %if ($ - msg) < 0x21
        TIMES   0x21 - ($ - msg) DB '$'
    %endif
    
    skip:
        DW      0x21cd + ascii_shift
        DW      0x20cd + ascii_shift
    
    %if skip - msg > 0x7e
    %error  'msg too long'
    %endif
    

    I won't explain how the code works, but I'll give you one hint: MS-DOS pushes a 16-bit 0 value on the stack at the start of execution of a .COM format executable.
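
To make the point about multiple encodings concrete, here is a minimal sketch of the two 16-bit encodings of the same conditional jump (jge and jnl are the same instruction): the short form 7D rel8 and the near form 0F 8D rel16. The offsets 0x1a (position of the jump) and 0x40 (its target) are taken by counting bytes in the file from the question, and the function names are hypothetical helpers, not part of any tool.

```python
def jge_short(at: int, target: int) -> bytes:
    """Short form: opcode 7D plus an 8-bit displacement.

    The displacement is relative to the end of the 2-byte instruction.
    """
    rel = target - (at + 2)
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for the short form")
    return bytes([0x7D, rel & 0xFF])


def jge_near16(at: int, target: int) -> bytes:
    """Near form in 16-bit code: opcode 0F 8D plus a 16-bit displacement.

    The displacement is relative to the end of the 4-byte instruction.
    """
    rel = target - (at + 4)
    return b"\x0f\x8d" + (rel & 0xFFFF).to_bytes(2, "little")


JUMP_AT = 0x1A  # offset of the jge in eicar.com (assumption: counted by hand)
TARGET = 0x40   # jump target printed by the disassembler

short = jge_short(JUMP_AT, TARGET)
near = jge_near16(JUMP_AT, TARGET)

print("short form:", short)  # the two bytes found in the original file
print("near form: ", near)   # the four bytes seen in the reassembled file
```

Both byte sequences mean "jump to 0x40 if greater or equal", but only the short form matches the original file. The near form is two bytes longer, so everything after it shifts and the literal target 0x40 no longer points at the same code, which is one reason a listing with hard-coded addresses rarely survives modification.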