
Debugging - how to dump a dynamically allocated array from an optimized C program?


I have a C program with a dynamically allocated array whose base pointer is a global variable. An abbreviated example program is below:

#include <stdlib.h>

double *array = NULL;

void foo() { /* complex calculations on array */ }
void bar() { /* complex calculations on array */ }

int main() {
  array = malloc( /* large size */ );
  foo();
  bar();
}

I have to compile this with gcc -O2 -no-pie program.c -lm. I can disassemble the binary with objdump -D -S a.out to find the address of array and of specific instructions. I want to write a script that does all of the following without my intervention:

  1. Run the program in some sort of debug state and wait until a specific instruction is reached - I am assuming that the instruction would always be in a specific memory location because I used -no-pie at compilation.
  2. Follow the global array pointer to the dynamically allocated memory - once again I am assuming that -no-pie will always ensure that the global pointer is at a specific memory location.
  3. Dump a specific number of bytes from the dynamically allocated memory into a file.

I took a look at gdb, but most tutorials mention compiling with -g -Og, and I would prefer to avoid that in favor of raw memory locations for checkpoints and the location of array. Is this possible with gdb, or will I have to use another tool?

Update:

I will be using an architecture simulator to intentionally inject bitflips into the program at various points. I want to determine how the bitflips corrupt the program data, specifically array. To see how the effect of a bitflip propagates, I need the contents of array at various points in the program.

I cannot add extra print functions in the middle of the program in the source code, because the program that I am testing must be comparable to a program that does the entire computation at once without any prints in between.


Solution

  • Is this possible with gdb

    Yes, GDB is perfectly capable of doing this.

    Let's start by making an actual working example:

    #include <stdlib.h>
    #include <stdio.h>
    
    const int SZ = 20;
    double *array = NULL;
    
    void foo()
    {
      for (int j = 0; j < SZ; j++)
        array[j] = 2.0;
    }
    
    void bar()
    {
      for (int j = 0; j < SZ; j++)
        array[j] /= 3.0;
    }
    
    
    int main()
    {
      array = malloc(sizeof(double) * SZ);
      foo();
      __asm__("nop");  // For a breakpoint
      bar();
      __asm__("nop");  // For a breakpoint
    
      return 0;
    }
    

    Compile this with gcc -O2 -fno-inline -no-pie t.c. The -fno-inline flag is there because otherwise GCC inlines foo() into main(), and there would be no call to break after.

    gdb -q ./a.out
    
    (gdb) disas main
    Dump of assembler code for function main:
       0x0000000000401040 <+0>:     sub    $0x8,%rsp
       0x0000000000401044 <+4>:     mov    $0xa0,%edi
       0x0000000000401049 <+9>:     callq  0x401030 <malloc@plt>
       0x000000000040104e <+14>:    mov    %rax,0x2fcb(%rip)        # 0x404020 <array>
       0x0000000000401055 <+21>:    xor    %eax,%eax
       0x0000000000401057 <+23>:    callq  0x401160 <foo>
       0x000000000040105c <+28>:    nop
       0x000000000040105d <+29>:    xor    %eax,%eax
       0x000000000040105f <+31>:    callq  0x401190 <bar>
       0x0000000000401064 <+36>:    nop
       0x0000000000401065 <+37>:    xor    %eax,%eax
       0x0000000000401067 <+39>:    add    $0x8,%rsp
       0x000000000040106b <+43>:    retq
    End of assembler dump.
    
    (gdb) b *0x000000000040105c
    Breakpoint 1 at 0x40105c
    (gdb) b *0x0000000000401064
    Breakpoint 2 at 0x401064
    
    (gdb) run
    Starting program: /tmp/a.out
    
    Breakpoint 1, 0x000000000040105c in main ()
    
    (gdb) p &array
    $1 = (<data variable, no debug info> *) 0x404020 <array>
    
    (gdb) p (*(double**)0x404020)[0]@20
    $2 = {2 <repeats 20 times>}
    
    (gdb) c
    Continuing.
    
    Breakpoint 2, 0x0000000000401064 in main ()
    (gdb) p (*(double**)0x404020)[0]@20
    $3 = {0.66666666666666663 <repeats 20 times>}
    

    As you can see, we've observed all the values we want.
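
    Read as ordinary C, the print expression casts 0x404020 to double**, dereferences it to load the heap pointer stored in array, and then uses GDB's @ operator to treat the 20 doubles starting at element 0 as an array. A minimal C sketch of the same pointer walk follows; read_through_pointer is a hypothetical helper written for illustration, not something GDB or the program provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies n doubles, starting at element 0, from the
   block whose base pointer is stored at address addr_of_pointer. */
void read_through_pointer(uintptr_t addr_of_pointer, double *out, size_t n)
{
    assert(addr_of_pointer != 0 && out != NULL);

    double **pp = (double **)addr_of_pointer;  /* (double**)0x404020   */
    double *base = *pp;                        /* *(double**)0x404020  */

    for (size_t i = 0; i < n; i++)             /* ...[0]@n             */
        out[i] = base[i];
}
```

    Inside the program itself, passing (uintptr_t)&array and 20 would copy the same 20 values that the print commands above display.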

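    You can hard-code the addresses of the two breakpoints and the address of the global pointer array into your script (here gdb.script). These addresses come from this particular build, so take yours from your own disassembly: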

    break *0x40105c
    break *0x401064
    run
    print (*(double**)0x404020)[0]@20
    cont
    print (*(double**)0x404020)[0]@20
    cont
    

    Then run GDB with gdb -batch -x gdb.script ./a.out, which produces:

    Breakpoint 1 at 0x40105c
    Breakpoint 2 at 0x401064
    Breakpoint 1, 0x000000000040105c in main ()
    $1 = {2 <repeats 20 times>}
    
    Breakpoint 2, 0x0000000000401064 in main ()
    $2 = {0.66666666666666663 <repeats 20 times>}
    [Inferior 1 (process 1838617) exited normally]
    

    The only remaining part is dumping the bytes into a file, which GDB's dump memory command does:

    (gdb) help dump memory
    Write contents of memory to a raw binary file.
    Arguments are FILE START STOP.  Writes the contents of memory within the
    range [START .. STOP) to the specified FILE in raw target ordered bytes.
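
    For the example above, a command along the lines of dump memory array.bin *(double**)0x404020 *(double**)0x404020+20 should write the 160 bytes of the array to array.bin (the file name is an assumption, and the +20 is pointer arithmetic on a double*). Because the file holds raw target-ordered bytes, it can be read back with a few lines of C. The sketch below assumes the reader runs on a host with the same endianness and double format as the target; read_dump and count_mismatches are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Reads up to max doubles from a raw dump file.
   Returns the number of doubles read, or 0 if the file cannot be opened. */
size_t read_dump(const char *path, double *out, size_t max)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;

    size_t n = fread(out, sizeof(double), max, f);
    fclose(f);
    return n;
}

/* Counts elements whose bit patterns differ between two dumps.
   memcmp is used instead of == so that NaNs and signed zeros are
   compared bit for bit. */
size_t count_mismatches(const double *a, const double *b, size_t n)
{
    size_t diff = 0;
    for (size_t i = 0; i < n; i++)
        if (memcmp(&a[i], &b[i], sizeof(double)) != 0)
            diff++;
    return diff;
}
```

    Loading a dump from a fault-free run and one from a run with an injected bitflip, then calling count_mismatches on the two buffers, gives a quick count of how many elements were corrupted at that checkpoint.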
    

    P.S. You can also suppress the Breakpoint 1, 0x000000000040105c in main () output by giving each breakpoint a command list that starts with silent, like so:

    break *0x40105c
    commands
      silent
      print (*(double**)0x404020)[0]@20
      cont
    end
    
    break *0x401064
    commands
      silent
      print (*(double**)0x404020)[0]@20
      cont
    end
    
    run
    

    The output becomes:

    Breakpoint 1 at 0x40105c
    Breakpoint 2 at 0x401064
    $1 = {2 <repeats 20 times>}
    $2 = {0.66666666666666663 <repeats 20 times>}
    [Inferior 1 (process 1839098) exited normally]