Tags: assembly, x86, gdb, nasm, shellcode

Why do I get a segmentation fault when writing to a writeable .data section? (Ubuntu, x86, nasm, gdb, readelf)


I'm learning to write simple shellcode in assembly. I get a segmentation fault when the mov instruction executes to write over the db data. Why? Any guidance appreciated! Debugging with gdb confirms the data is contiguous with the code at run time and readelf analysis of the program confirms the data segment is writeable.

    section .text
    global _start
        _start:

          ; The following code calls execve("/bin/sh", argv, envp=0)
          jmp short two
    one:
          pop ebx
          xor eax, eax
          mov [ebx+12], eax
          mov [ebx+7], al
          mov [ebx+8], ebx
          lea ecx, [ebx+8]
          lea edx, [ebx+12]
          mov al, 11
          int 0x80
    two:
          call one
    section .data align=1
          db '/bin/shzargvenvp'

Additional info after reading comments:

It segfaults both when run stand-alone from the Linux command line (./myshdb) and when I step into the mov instruction in gdb (set a breakpoint at "one", run, then step repeatedly).

Yes, I'm assembling and running on a 32-bit Ubuntu installation. Here are the command lines I'm using (they all work fine for a variant of this shellcode):

    nasm -f elf32 -g -F stabs myshdb.s -o myshdb.o
    objdump -Mintel --disassemble myshdb.o
    ld myshdb.o -o myshdb
    readelf -a myshdb
    gdb myshdb

With different algorithms, the same assemble and link commands all work fine and the program runs fine. The trouble is something about having data immediately after the code and trying to write into the data section. Initially everything was in the .text section, but that is clearly read-only, so I thought a data declaration aligned on a 1-byte boundary would work. The 1-byte alignment works, but somehow the write fails even though readelf says the section is loaded writeable. Notice the 16 bytes (size=0x10) in the .data section with the "W" flag.

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        08048080 000080 000025 00  AX  0   0 16
  [ 2] .data             PROGBITS        080490a5 0000a5 000010 00  WA  0   0  1
  [ 3] .stab             PROGBITS        00000000 0000b8 0000d8 0c      4   0  4
  [ 4] .stabstr          STRTAB          00000000 000190 00000a 00      0   0  1
  [ 5] .shstrtab         STRTAB          00000000 00029b 000036 00      0   0  1
  [ 6] .symtab           SYMTAB          00000000 00019c 0000d0 10      7   9  4
  [ 7] .strtab           STRTAB          00000000 00026c 00002f 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Does data immediately follow code? gdb output below, halted just before executing the first mov opcode. Data appears contiguous after code. EBX contains address 0x80480a5, which points to valid string data immediately after code. Examining memory (x 0x80480a5) also confirms a contiguous location.

[----------------------------------registers-----------------------------------]
EAX: 0x0 
EBX: 0x80480a5 ("/bin/shZargvenvp")
ECX: 0x0 
EDX: 0x0 
ESI: 0x0 
EDI: 0x0 
EBP: 0x0 
ESP: 0xbfffeda0 --> 0x1 
EIP: 0x804808d (<one+3>:    mov    DWORD PTR [ebx+0xc],eax)
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048088 <zero>:    jmp    0x80480a0 <two>
   0x804808a <one>: pop    ebx
   0x804808b <one+1>:   xor    eax,eax
=> 0x804808d <one+3>:   mov    DWORD PTR [ebx+0xc],eax
   0x8048090 <one+6>:   mov    DWORD PTR [ebx+0x8],ebx
   0x8048093 <one+9>:   mov    BYTE PTR [ebx+0x7],al
   0x8048096 <one+12>:  lea    ecx,[ebx+0x8]
   0x8048099 <one+15>:  lea    edx,[ebx+0xc]
   [------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x0804808d in one ()
gdb-peda$ x 0x80480a5
0x80480a5:  "/bin/shZargvenvp"

@Employed Russian asked for the output of readelf -Wl. Here is the information from when I rebuilt everything from scratch:

---------- code snippet compiled with nasm, ld -----------------
zero: jmp short two
one:  pop ebx
      xor eax, eax
      mov [ebx+12], eax
      mov [ebx+8], ebx
      mov [ebx+7], al
      lea ecx, [ebx+8]
      lea edx, [ebx+12]
      mov al, 11
      int 0x80
two:  call one
section .data align=1
msg:   db '/bin/sh0argvenvp' 

-------- readelf output as requested --------
readelf -Wl myshdb

Elf file type is EXEC (Executable file)
Entry point 0x8048080
There are 2 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048000 0x08048000 0x0009d 0x0009d R E 0x1000
  LOAD           0x00009d 0x0804909d 0x0804909d 0x00010 0x00010 RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text 
   01     .data 

-------------- run with gdb and step to mov instructions ----------
---------------registers--------------
EAX: 0x0 
EBX: 0x804809d ("/bin/sh0argvenvp")

----------- memory address checks ------------
gdb-peda$ p zero
$15 = {<text variable, no debug info>} 0x8048080 <zero>
gdb-peda$ p one
$16 = {<text variable, no debug info>} 0x8048082 <one>
gdb-peda$ p two
$17 = {<text variable, no debug info>} 0x8048098 <two>
gdb-peda$ p $ebx
$18 = 0x804809d
gdb-peda$ p msg
$19 = 0x6e69622f
gdb-peda$ x 0x804809d
0x804809d:  "/bin/sh0argvenvp"
gdb-peda$ x msg
0x6e69622f: <error: Cannot access memory at address 0x6e69622f>

In other words, the string is available at a memory location directly after the code (0x804809d), yet the msg label maps to 0x6e69622f. How can I see the data there with gdb? What does the readelf -Wl output tell me?


Solution

  • Debugging with gdb confirms the data is contiguous with the code at run time and readelf analysis of the program confirms the data segment is writeable.

    You are expecting db '...' to immediately follow CALL one.

    That is not what happens. Your .data section is placed in a different segment, because it needs different permissions:

    readelf -Wl myshdb
    Program Headers:
      Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
      LOAD           0x000000 0x08048000 0x08048000 0x00094 0x00094 R   0x1000
      LOAD           0x001000 0x08049000 0x08049000 0x0001d 0x0001d R E 0x1000
      LOAD           0x002000 0x0804a000 0x0804a000 0x00010 0x00010 RW  0x1000
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .text
       02     .data
    

    Note that .data is in its own LOAD segment (the last one listed, flagged RW), and that segment begins on a different page from the one holding .text.

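    What may be confusing you is that the bytes of .data can also show up right after the code for two: if the linker places .data directly after .text in the file, the loader, which maps whole pages, may carry a copy of those bytes into the read-only, executable page that holds .text (my version doesn't; it's all 0s for me).

    In any case, your code as written tries to write to the segment that contains .text, at the location immediately after the end of two, and that segment is (clearly) not writable.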
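
The address arithmetic behind this can be checked with a small sketch. It is written in Python only so it is easy to run. The segment values are copied from the `readelf -Wl` output in the question, and the `page_flags` helper and the whole-page rounding are illustrative assumptions about how the loader maps segments, not output from any tool.

```python
PAGE = 0x1000  # page size, taken from the Align column of readelf -Wl

# (VirtAddr, MemSiz, Flg) copied from the question's program headers
SEGMENTS = [
    (0x08048000, 0x9D, "R E"),  # segment 00, holds .text
    (0x0804909D, 0x10, "RW"),   # segment 01, holds .data
]

def page_flags(addr):
    """Hypothetical helper: flags of the segment whose mapped pages contain addr."""
    for vaddr, memsz, flags in SEGMENTS:
        start = vaddr & ~(PAGE - 1)                     # round down to a page boundary
        end = (vaddr + memsz + PAGE - 1) & ~(PAGE - 1)  # round up to a page boundary
        if start <= addr < end:
            return flags
    return None

ebx = 0x0804809D   # value popped into EBX: the return address pushed by `call one`
data = 0x0804909D  # where .data is actually loaded

print(hex(ebx), page_flags(ebx))    # prints: 0x804809d R E
print(hex(data), page_flags(data))  # prints: 0x804909d RW
```

The two addresses differ by exactly one page (0x1000). The pointer obtained through jmp/call/pop lands in the page mapped for .text, so the mov faults even though .data itself is writable.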