x86-64, disassembly, portable-executable

Do these Windows executable metadata fields mean what I think they do?


I'm learning Assembly as part of a malware analysis project and trying to use a few Node.js libraries to scrape executables from GitHub and disassemble them.

Specifically I'm focusing on x86-64 PE.

But a disassembler, such as the one I chose, doesn't necessarily know how to locate the instructions inside a particular executable format such as PE.

Besides needing to know where the instructions start in the file, I realized once I started using the disassembler that I also need to give it an initial RIP value for the program. I don't fully understand why different programs start at different memory addresses, but supposedly it's to let other cooperating processes put memory in the same block. Or something like that.

So my goal is to know:

  • the correct starting value for RIP
  • the file offset of the first instruction, past the headers.

So I used a library to read the metadata, like so:

let metaData = await executableMetadata.getMetadataObjectFromExecutableFilePath_Async(execPath);

Which, when passed an exe whose header looks like this:

0:      4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00
16:     b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
32:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
48:     00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00
64:     0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68
80:     69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f
96:     74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20
112:    6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00
128:    50 45 00 00 4c 01 03 00 91 3f 9a ef 00 00 00 00
144:    00 00 00 00 e0 00 22 00 0b 01 30 00 00 12 00 00

tells us:

{
  format: 'PE',
  pe_header_offset_16le: 128,
  machine_type: 332,
  machine_type_object: {
    constant: 'IMAGE_FILE_MACHINE_I386',
    description: 'Intel 386 or later processors and compatible processors'
  },
  number_of_sections: 3,
  timestamp: -275103855,
  coff_symbol_table_offset: 0,
  coff_number_of_symbol_table_entries: 0,
  size_of_optional_header: 224,
  characteristics_bitflag: 34,
  characteristics_bitflags: [
    {
      constant: 'IMAGE_FILE_EXECUTABLE_IMAGE',
      description: 'Image only. This indicates that the image file is valid and can be run. If this flag is not set, it indicates a linker error.',
      flag_code: 2
    },
    {
      constant: 'IMAGE_FILE_LARGE_ADDRESS_AWARE',
      description: 'Application can handle > 2-GB addresses.',
      flag_code: 32
    }
  ],
  object_type_code: 267,
  object_type: 'PE32',
  linker: { major_version: 48, minor_version: 0 },
  size_of_code: 4608,
  size_of_initialized_data: 2048,
  size_of_uninitialized_data: 0,
  address_of_entry_point: 12586,
  base_of_code: 8192,
  windows_specific: {
    image_base: 4194304,
    section_alignment: 8192,
    file_alignment: 512,
    major_os_version: 4,
    minor_os_version: 0,
    major_image_version: 0,
    minor_image_version: 0,
    major_subsystem_version: 6,
    minor_subsystem_version: 0,
    win32_version: 0,
    size_of_image: 32768,
    size_of_headers: 512,
    checksum: 0,
    subsystem: {
      constant: 'IMAGE_SUBSYSTEM_WINDOWS_CUI',
      description: 'The Windows character subsystem',
      subsystem_code: 3
    },
    dll_characteristics: 34144,
    dll_characteristic_flags: [ [Object], [Object], [Object], [Object], [Object] ]
  },
  base_of_data: 16384
}

And from this, I think maybe I found the two pieces of info I needed:

  • First instruction byte: windows_specific.size_of_headers (512)
  • RIP starting value: address_of_entry_point (12586)

But I'm basically guessing. Could anyone more familiar with this metadata explain which properties I should look at to get the info I need?


Solution

  • A Windows executable file begins with a 16-bit DOS stub. The doubleword at file offset 60 holds the file offset of the DWORD PE signature; in your example it is 60: 80 00 00 00, i.e. 128 in decimal. The PE signature is immediately followed by the COFF file header (file offset 132). You may want to compare your hexadecimal dump against the structure of the headers in assembly language. COFF_FILE_HEADER.Machine is 132: 4C 01, i.e. 0x14C, which indicates a 32-bit executable. In a 64-bit executable it would be 0x8664.

    The file header is followed by the optional header (.SizeOfOptionalHeader bytes, 224 in your example) and then by the COFF section headers. You are interested in the sections that have the bit SCN_MEM_EXECUTE=0x2000_0000 set in COFF_SECTION_HEADER.Characteristics.

    COFF_SECTION_HEADER.PointerToRawData specifies the file offset where the section's code starts. Extract .SizeOfRawData bytes starting at this file offset and submit that portion to your disassembler. Beware that at run time the code is in fact mapped at .VirtualAddress (an offset relative to the image base), which differs from .PointerToRawData. Assuming the image loads at its preferred ImageBase, the address to give the disassembler for the section's first byte is therefore ImageBase + .VirtualAddress, not a file offset.
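The steps above can be sketched in Node.js roughly as follows. This is a minimal sketch, not the library used in the question: the function names parsePe, executableSections and locateEntryPoint are made up for illustration, the field offsets come from the PE/COFF header layout, and it assumes a well-formed file held entirely in a Buffer (there is no bounds checking).

```javascript
// Illustrative sketch only; names are hypothetical, offsets follow the PE/COFF layout.
const IMAGE_SCN_MEM_EXECUTE = 0x20000000;

function parsePe(buf) {
  // The DWORD at file offset 60 holds the offset of the "PE\0\0" signature.
  const peOffset = buf.readUInt32LE(60);
  if (buf.toString('latin1', peOffset, peOffset + 4) !== 'PE\0\0') {
    throw new Error('PE signature not found');
  }

  // The 20-byte COFF file header follows the 4-byte signature.
  const coff = peOffset + 4;
  const machine = buf.readUInt16LE(coff);            // 0x14C = x86, 0x8664 = x86-64
  const numberOfSections = buf.readUInt16LE(coff + 2);
  const sizeOfOptionalHeader = buf.readUInt16LE(coff + 16);

  // Optional header: Magic 0x10B = PE32, 0x20B = PE32+.
  const opt = coff + 20;
  const magic = buf.readUInt16LE(opt);
  const entryRva = buf.readUInt32LE(opt + 16);       // AddressOfEntryPoint (an RVA)
  const imageBase = magic === 0x20b
    ? buf.readBigUInt64LE(opt + 24)                  // PE32+: 8-byte ImageBase
    : BigInt(buf.readUInt32LE(opt + 28));            // PE32: 4-byte ImageBase

  // Section headers (40 bytes each) start right after the optional header.
  const sections = [];
  let p = opt + sizeOfOptionalHeader;
  for (let i = 0; i < numberOfSections; i++, p += 40) {
    sections.push({
      name: buf.toString('latin1', p, p + 8).replace(/\0+$/, ''),
      virtualSize: buf.readUInt32LE(p + 8),
      virtualAddress: buf.readUInt32LE(p + 12),
      sizeOfRawData: buf.readUInt32LE(p + 16),
      pointerToRawData: buf.readUInt32LE(p + 20),
      characteristics: buf.readUInt32LE(p + 36),
    });
  }
  return { machine, magic, entryRva, imageBase, sections };
}

// Sections whose Characteristics have the execute bit set.
function executableSections(pe) {
  return pe.sections.filter(
    (s) => (s.characteristics & IMAGE_SCN_MEM_EXECUTE) !== 0);
}

// Map the entry-point RVA to a file offset and to a run-time address,
// assuming the image is loaded at its preferred ImageBase.
function locateEntryPoint(pe) {
  const s = pe.sections.find((sec) =>
    pe.entryRva >= sec.virtualAddress &&
    pe.entryRva < sec.virtualAddress + Math.max(sec.virtualSize, sec.sizeOfRawData));
  if (!s) throw new Error('entry point is not inside any section');
  return {
    section: s.name,
    fileOffset: pe.entryRva - s.virtualAddress + s.pointerToRawData,
    virtualAddress: pe.imageBase + BigInt(pe.entryRva),
  };
}
```

With a file loaded through fs.readFileSync(execPath), executableSections(parsePe(buf)) yields the byte ranges (pointerToRawData, sizeOfRawData) to hand to the disassembler, and locateEntryPoint gives the file offset of the first instruction together with the address to use as the starting instruction pointer. For the metadata in the question, that address would be image_base + address_of_entry_point = 4194304 + 12586 = 0x40312A, provided the image is loaded at its preferred base. size_of_headers by itself is not the location of the first instruction.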