I'm writing a little PE reader, so I run dumpbin alongside my test application to confirm that the values are being read correctly. Everything is working so far, except for the export table.
The file I'm testing with is a DLL. My application reads the file in as a byte array, which gets passed to my PE reader class. The values align with those output by dumpbin, including the RVA and size of the export data directory.
E000 [ 362] RVA [size] of Export Directory
The problem is, the byte array's size is only 42,496. As you can probably imagine, when my PE reader attempts to read at E000 (57,344), I get an IndexOutOfRangeException. dumpbin, however, has no such problem and reads the export directory just fine. And yes, the entire file is indeed being read into the byte array.
How is this possible?
The PE file contains "sections", and the sections have independent base addresses. The PE file is not one contiguous memory image; only each individual section is. An RVA such as E000 is an address relative to where the image is loaded in memory, not an offset into the file, which is why it can lie beyond the end of your byte array.
First you will have to read the section headers and build a map of their layout. Then you will be able to translate section-relative addresses into file-based offsets.
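Reading the section information can be sketched roughly as follows. This is a minimal sketch in Python, assuming the whole file is already in a byte array as in the question; the names Section and read_sections are hypothetical, and the field offsets follow the published PE/COFF layout (e_lfanew at 0x3C, a 20-byte COFF header after the "PE\0\0" signature, 40-byte section headers after the optional header).

```python
import struct
from typing import List, NamedTuple


class Section(NamedTuple):
    """The section-header fields needed to map RVAs to file offsets."""
    name: str
    virtual_size: int
    virtual_address: int   # RVA at which the section starts
    raw_size: int          # "size of raw data" in dumpbin
    raw_pointer: int       # "file pointer to raw data" in dumpbin


def read_sections(data: bytes) -> List[Section]:
    """Parse the section table out of a PE file held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4                      # start of the COFF file header
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + opt_size              # section table follows the optional header
    sections = []
    for i in range(num_sections):
        # Only the first 24 bytes of each 40-byte header are needed here.
        name, vsize, vaddr, rsize, rptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * i)
        sections.append(Section(
            name.rstrip(b"\0").decode("ascii", "replace"),
            vsize, vaddr, rsize, rptr))
    return sections
```

Calling read_sections() on the byte array should give one entry per "SECTION HEADER" block that dumpbin prints, which makes it easy to compare the two outputs.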
As an aside, consider looking at OllyDbg, which is a freeware debugger and disassembler for Windows. It may help you test your own software, and might serve the very purpose you are trying to fill by "rolling your own."
Example from dumpbin /all output:
SECTION HEADER #1
   .text name
    BC14 virtual size
    1000 virtual address (00401000 to 0040CC13)
    BE00 size of raw data
     400 file pointer to raw data (00000400 to 0000C1FF)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         Execute Read
In this case, my .text section begins at RVA 1000, and its raw data extends up to (but not including) RVA CE00. The file pointer to this section is 400. I can translate any RVA in the range 1000-CDFF to a file pointer by subtracting C00 (that is, 1000 - 400). (All numeric values are hexadecimal.)
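Whenever you encounter an "RVA" (Relative Virtual Address), you resolve it to a file offset (or an index into your byte array) using this method: find the section whose RVA range contains it, subtract that section's virtual address, and add that section's file pointer to raw data.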
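As a rough illustration of that translation, here is a minimal Python sketch using the .text values from the dumpbin output above. The names Section and rva_to_offset are hypothetical, and bounding the lookup by the size of raw data (so that only file-backed bytes resolve) is my assumption; a reader that also wants to recognise zero-filled virtual space would need to handle the virtual size separately.

```python
from typing import Iterable, NamedTuple, Optional


class Section(NamedTuple):
    name: str
    virtual_size: int
    virtual_address: int   # RVA at which the section starts
    raw_size: int          # "size of raw data"
    raw_pointer: int       # "file pointer to raw data"


def rva_to_offset(rva: int, sections: Iterable[Section]) -> Optional[int]:
    """Return the file offset backing `rva`, or None if no section's raw data covers it."""
    for s in sections:
        delta = rva - s.virtual_address
        # Assumption: only bytes actually stored in the file are resolvable.
        if 0 <= delta < s.raw_size:
            return s.raw_pointer + delta
    return None


# Values taken from the .text section header shown above.
text = Section(".text", 0xBC14, 0x1000, 0xBE00, 0x400)
print(hex(rva_to_offset(0x1000, [text])))  # 0x400
print(hex(rva_to_offset(0x2345, [text])))  # 0x1745
```

With the full section list from your DLL, passing the export directory RVA (E000) through the same function should yield an index that falls inside the 42,496-byte array.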
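Another approach that you might use is to let the OS map the file for you by calling MapViewOfFileEx(). Note that the FILE_MAP_EXECUTE flag in the dwDesiredAccess argument only affects page protection; for the OS to parse the section headers from the PE file and read the contents of the sections into their locations relative to the "module base," the file-mapping object you pass in has to be created by CreateFileMapping() with the SEC_IMAGE attribute.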
The module base is the base address at which the PE header will be loaded. When loading DLLs using the LoadLibrary() functions, this can be obtained via the lpBaseOfDll member of the MODULEINFO structure filled in by GetModuleInformation().
When using MapViewOfFileEx(), the module base is simply the return value from MapViewOfFileEx().
When the module has been loaded in one of these ways, resolving an RVA to a normal pointer value is a matter of storing the module base address in a char *, adding the RVA to that char *, and then casting the resulting char * to the actual datatype and dereferencing that.
A drawback of letting the OS map the file as in these approaches is that if you are using this tool to investigate some suspect file and are not sure whether a developer has taken strange liberties with the section headers, you may miss some valuable information by letting the OS handle this part of the parsing.