
How does dlopen() create the read-only VMA?


I have been researching how dlopen() loads a shared library into memory on Linux, but I can't find how or where glibc creates the read-only area in memory.

Glibc's dlopen() uses the program headers to find the segments of type LOAD and maps them into memory. For a shared library, these are only the first two:

Program Headers:
Type           Offset             VirtAddr           PhysAddr           FileSiz            MemSiz              Flags  Align
LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000 0x000000000000075c 0x000000000000075c  R E    0x200000
LOAD           0x0000000000000e00 0x0000000000200e00 0x0000000000200e00 0x0000000000000228 0x0000000000000230  RW     0x200000
...

The protection bits (second-to-last column) are read/execute for the first segment and read/write for the second. The corresponding sections are:

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
   01     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss

The memory layout of a randomly chosen dynamic library from the init process is as follows:

7f4b67833000-7f4b67837000 r-xp 000000 08:01 393388 /lib/x86_64-linux-gnu/libcap.so.2.25
7f4b67837000-7f4b67a37000 ---p 004000 08:01 393388 /lib/x86_64-linux-gnu/libcap.so.2.25
7f4b67a37000-7f4b67a38000 r--p 004000 08:01 393388 /lib/x86_64-linux-gnu/libcap.so.2.25
7f4b67a38000-7f4b67a39000 rw-p 005000 08:01 393388 /lib/x86_64-linux-gnu/libcap.so.2.25

In this case there is one additional memory area that has only the read protection bit set. What is the reason for that, and where is this done? And why is the .rodata section contained in the first segment, where data is executable?

The loading part is done in elf/dl-load.c.

The interesting function is _dl_map_segments which is called on line 1181 from the _dl_map_object_from_fd function. This function is defined in the file elf/dl-map-segments.h.

But this function only maps the segments with their protection bits. Am I missing something?
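
To make "maps the segments with their protection bits" concrete, here is a minimal sketch of how the Flags column of a LOAD header translates into mmap protection bits. The function name load_segment_prot is made up for illustration and is not a glibc function; the sketch also assumes a 64-bit ELF object (Elf64_Phdr).

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <sys/mman.h>

/* Translate the p_flags of a PT_LOAD program header into the PROT_*
   bits one would pass to mmap/mprotect.
   Hypothetical helper for illustration, not part of glibc. */
int load_segment_prot(const Elf64_Phdr *ph)
{
    assert(ph != NULL && ph->p_type == PT_LOAD);

    int prot = PROT_NONE;
    if (ph->p_flags & PF_R)
        prot |= PROT_READ;
    if (ph->p_flags & PF_W)
        prot |= PROT_WRITE;
    if (ph->p_flags & PF_X)
        prot |= PROT_EXEC;
    return prot;
}
```

With only this translation, the two LOAD headers above would yield an r-x and an rw- mapping, which is why the r--p area looks unexplained.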


Solution

  • The read-only area in the middle is created using mprotect in response to the PT_GNU_RELRO program header. The following eu-readelf output suggests this:

    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x001698 0x001698 R   0x1000
      LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x001b01 0x001b01 R E 0x1000
      LOAD           0x004000 0x0000000000004000 0x0000000000004000 0x000bdc 0x000bdc R   0x1000
      LOAD           0x005950 0x0000000000006950 0x0000000000006950 0x000800 0x000808 RW  0x1000
      DYNAMIC        0x005cf0 0x0000000000006cf0 0x0000000000006cf0 0x0001f0 0x0001f0 RW  0x8
      NOTE           0x000238 0x0000000000000238 0x0000000000000238 0x000024 0x000024 R   0x4
      GNU_EH_FRAME   0x0045b4 0x00000000000045b4 0x00000000000045b4 0x00010c 0x00010c R   0x4
      GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
      GNU_RELRO      0x005950 0x0000000000006950 0x0000000000006950 0x0006b0 0x0006b0 R   0x1
    
     Section to Segment mapping:
      Segment Sections...
       00      [RO: .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt]
       01      [RO: .init .plt .plt.got .text .fini]
       02      [RO: .rodata .eh_frame_hdr .eh_frame]
       03      [RELRO: .init_array .fini_array .data.rel.ro .dynamic .got] .data .bss
       04      [RELRO: .dynamic]
       05      [RO: .note.gnu.build-id]
       06      [RO: .eh_frame_hdr]
       07     
       08      [RELRO: .init_array .fini_array .data.rel.ro .dynamic .got]
    

    (I assume your example shared object is from Debian 10 or some similar distribution.)
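
If you want to see the same information from inside a running process instead of from eu-readelf, a rough sketch using dl_iterate_phdr (a glibc extension declared in <link.h>) could look like this. The names print_relro and count_relro_segments are my own, chosen for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback for dl_iterate_phdr: print every PT_GNU_RELRO header of one
   loaded object and count it through the int that data points to. */
int print_relro(struct dl_phdr_info *info, size_t size, void *data)
{
    (void) size;
    assert(data != NULL);

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_GNU_RELRO)
            continue;
        /* dlpi_addr is the load base, p_vaddr is relative to it. */
        printf("%s: RELRO at %#lx, size %#lx\n",
               info->dlpi_name[0] ? info->dlpi_name : "(main program)",
               (unsigned long) (info->dlpi_addr + ph->p_vaddr),
               (unsigned long) ph->p_memsz);
        ++*(int *) data;
    }
    return 0; /* zero tells dl_iterate_phdr to continue */
}

/* Walk all loaded objects; return the number of RELRO headers seen. */
int count_relro_segments(void)
{
    int count = 0;
    dl_iterate_phdr(print_relro, &count);
    return count;
}
```

Calling count_relro_segments() from main prints one line per object that carries a GNU_RELRO header; the addresses can then be compared with the r--p lines in /proc/self/maps.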

    PT_GNU_RELRO is parsed in elf/dl-load.c along with the other program headers. The read-only setting itself is applied in elf/dl-reloc.c, function _dl_protect_relro, after relocation is complete. RELRO stands for relocation (and then) read-only.
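
As a rough sketch of what that step amounts to (modeled on my reading of _dl_protect_relro, not copied from glibc; the names relro_pages and protect_relro are hypothetical): both ends of the RELRO segment are rounded down to a page boundary and the resulting range is made read-only with mprotect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Page range covered by a RELRO segment; end is exclusive. */
struct relro_range {
    uintptr_t start;
    uintptr_t end;
};

/* Round both ends of [addr, addr + size) down to a page boundary.
   The start moves down to the page that contains the segment start;
   a partially covered last page is left out and stays writable. */
struct relro_range relro_pages(uintptr_t addr, size_t size, size_t pagesize)
{
    /* pagesize must be a nonzero power of two for the mask to work. */
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);

    struct relro_range r;
    r.start = addr & ~(uintptr_t) (pagesize - 1);
    r.end = (addr + size) & ~(uintptr_t) (pagesize - 1);
    return r;
}

/* Make the pages read-only. Returns 0 on success or when the range is
   empty, and -1 with errno set if mprotect fails. */
int protect_relro(uintptr_t addr, size_t size, size_t pagesize)
{
    struct relro_range r = relro_pages(addr, size, pagesize);
    if (r.start == r.end)
        return 0;
    return mprotect((void *) r.start, r.end - r.start, PROT_READ);
}
```

For the GNU_RELRO header above (VirtAddr 0x6950, MemSiz 0x6b0) and 4 KiB pages, relro_pages gives the range 0x6000 to 0x7000 relative to the load base, that is, a single page. That is the same size as the r--p mapping in your /proc output.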

    There is no separate PT_LOAD segment for the read-only portion because, originally, there was a desire to limit the number of loaded segments for performance reasons. That approach no longer works so well due to competing requirements.