
Executable Section Headers - Meaning and use?


By opening many executable files (.exe, .msi) on Windows with 7zip, I have noticed many entries that keep recurring. These include .text, .data, .bss, .rdata, .pdata, etc. I've tried to find information about them, but I can't work out what they all mean. Here are some of them:

  • .text : Code section, contains the program’s instructions - read only -.
  • .data : Generally used for writable data with some initialized non-zero content. Thus, the data section contains information that could be changed during application execution and this section must be copied for every instance.
  • .bss : Used for writable static data initialized to zero.
  • .rdata : Const / Read-only data of any kind are stored here.
  • .edata : Export directory, descriptors & handles
  • .idata : Import directory for handles & descriptors. It is used by executable files (EXEs, DLLs, etc.) to designate the imported functions; exported functions are described by .edata.
  • .rsrc : Section which holds information about various other resources needed by the executable, such as the icon that is shown when looking at the executable file in explorer

There are many others that are common and that I can't find any information on. Mostly those are: .pdata, .tls, .reloc, CERTIFICATE, .rsrc_1, .aspack, .adata, .INIT, DATA, CODE, .ctors.

Also a rsrc folder is contained in most of them, which contains folders like BITMAP, CURSOR, ICON, GROUP_CURSOR, GROUP_ICON, MENU, VERSION and others.

Some executables also contain more executables inside, .html files, .txt files, etc. I also opened one which contained nothing at all (at least nothing shown by 7zip)!


Questions

  1. What do the sections / segments I posted do? Is there a website where I can find them all?
  2. All those I looked at are PEs for Windows. Are these formats standard, and do they apply to Linux, UNIX, etc. in a similar / the same way?
  3. Why do some executables contain other executables inside, or .html, .txt and other files? How are these handled when you launch the executable? What are they supposed to do? AFAIK an executable should contain only those "segments" that resemble assembly code sections.
  4. What is the use of the rsrc folder? What kind of resources does it hold?

I would appreciate it if you could post more information / links on why all of these are used (as low level as possible) and, more generally, what the executable structure should look like, what it should contain, etc.

That's about all.


EDIT

I found other common section header names. I will post their meaning here for completeness.

  • .reloc : Contains the relocation table.
  • .pdata : Contains an array of function table entries for exception handling, and is pointed to by the exception table entry in the image data directory.
  • *data : custom data section names
  • .init : This section holds executable instructions that contribute to the process initialization code. That is, when a program starts to run the system arranges to execute the code in this section before the main program entry point (called main in C programs).
  • .fini : This section holds executable instructions that contribute to the process termination code. That is, when a program exits normally, the system arranges to execute the code in this section.
  • .ctors : Section which preserves a list of constructors
  • .dtors : Section that holds a list of destructors
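
The section names listed above, together with their sizes and read/write/execute flags, can be read straight from the section table of a PE file. The following is only a minimal sketch of that idea, not a complete parser: the function name `list_sections` is made up for illustration, it assumes a well-formed PE image passed in as bytes, and the header layouts and flag values are taken from the published PE/COFF format.

```python
import struct

# Section characteristic flags (values from the PE/COFF format).
MEM_EXECUTE = 0x20000000
MEM_READ = 0x40000000
MEM_WRITE = 0x80000000


def list_sections(image: bytes):
    """Return (name, rva, virtual_size, raw_size, perms) for each section.

    Hypothetical helper for illustration; assumes a well-formed PE image.
    """
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # The 20-byte COFF file header follows the signature.
    coff = pe_offset + 4
    (_machine, n_sections, _timestamp, _symtab, _nsyms,
     opt_size, _characteristics) = struct.unpack_from("<HHIIIHH", image, coff)

    # The section table starts right after the optional header.
    table = coff + 20 + opt_size
    sections = []
    for i in range(n_sections):
        # Each section header is 40 bytes; the name is 8 bytes, NUL-padded.
        (raw_name, vsize, rva, raw_size, _raw_ptr, _reloc_ptr, _line_ptr,
         _n_relocs, _n_lines, flags) = struct.unpack_from(
            "<8sIIIIIIHHI", image, table + 40 * i)
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        perms = "".join(
            letter if flags & bit else "-"
            for letter, bit in (("r", MEM_READ), ("w", MEM_WRITE), ("x", MEM_EXECUTE))
        )
        sections.append((name, rva, vsize, raw_size, perms))
    return sections
```

Calling `list_sections(open(path, "rb").read())` on an executable of your choice returns one tuple per section. A section like .bss would typically show a raw size of zero, since zero-initialized data does not need to be stored in the file.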

Solution

  • Section names are not relevant to the file format; the toolchain (typically the linker) can pick anything it likes. The operating system does not use names to find the sections it cares about. It uses the data directory in the file header, which contains numbers, not names. The name just serves as a mnemonic to help identify sections, or might be used to help a language runtime or debugger find sections that are not covered by the data directory.

    There is some consistency in section names, largely by convention. An odd section name like BSS goes all the way back to the 1950s, when it was used in Fortran; it is an acronym for Block Started by Symbol. That does not help much in guessing its use today :) You can assume that a section named CODE will contain executable code and is equivalent to .text, the much more common name choice. Names like .tls and .reloc can be mapped to the corresponding data directory entry without much trouble.

    Same recipe for .rsrc: it maps to the third entry in the data directory. It matters to the OS, since a winapi function like LoadString needs it.

    However, only knowing the toolchain in detail gives you a real clue to the oddball ones.

    The operating system loader places a section directly into virtual memory through a memory-mapped file that uses the executable file as the backing store. That is how sections like .text, .data and .bss are used; note how they don't have a corresponding entry in the data directory. The linker took care of generating the proper addresses, the way it was done 25+ years ago, with no help needed from the OS. The exception is the .reloc section, which the loader needs if the file could not be mapped at its preferred base address; that mechanism is old as well.
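
To make the "numbers, not names" point concrete, here is a rough sketch that reads the data directory and reports which section each non-empty entry points into, whatever that section happens to be called. Treat it as an illustration under stated assumptions rather than a reference implementation: `directories_by_section` and `DIRECTORY_NAMES` are names invented for this example, the input is assumed to be a well-formed PE32 or PE32+ image, and the offsets and entry order come from the published PE/COFF format.

```python
import struct

# Data directory entries in index order (per the PE/COFF format).
DIRECTORY_NAMES = [
    "Export", "Import", "Resource", "Exception", "Certificate",
    "Base Relocation", "Debug", "Architecture", "Global Ptr", "TLS",
    "Load Config", "Bound Import", "IAT", "Delay Import",
    "CLR Runtime", "Reserved",
]
CERTIFICATE_INDEX = 4


def directories_by_section(image: bytes):
    """Map each non-empty data directory entry to the name of the section
    containing its RVA (or None). Hypothetical helper for illustration."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    coff = pe_offset + 4
    (_machine, n_sections, _timestamp, _symtab, _nsyms,
     opt_size, _characteristics) = struct.unpack_from("<HHIIIHH", image, coff)

    # The optional header follows the COFF header; its magic selects the layout.
    opt = coff + 20
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic == 0x10B:      # PE32
        dir_offset = 96
    elif magic == 0x20B:    # PE32+
        dir_offset = 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes sits immediately before the directory array.
    (count,) = struct.unpack_from("<I", image, opt + dir_offset - 4)

    # Collect (name, start RVA, length) for every section.
    table = opt + opt_size
    sections = []
    for i in range(n_sections):
        raw_name, vsize, rva, raw_size = struct.unpack_from(
            "<8sIII", image, table + 40 * i)
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        sections.append((name, rva, max(vsize, raw_size)))

    result = {}
    for index in range(min(count, len(DIRECTORY_NAMES))):
        address, size = struct.unpack_from("<II", image, opt + dir_offset + 8 * index)
        if size == 0:
            continue  # entry not present in this image
        owner = None
        # The Certificate entry holds a file offset, not an RVA, so it is
        # not looked up in the section table.
        if index != CERTIFICATE_INDEX:
            owner = next(
                (name for name, start, length in sections
                 if start <= address < start + length),
                None,
            )
        result[DIRECTORY_NAMES[index]] = owner
    return result
```

On a typical linker output you would expect the Resource entry to land in a section named .rsrc and the Base Relocation entry in .reloc, but the lookup works the same way if a packer or another toolchain has renamed those sections.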