c++ linux gcc4.7

Super weird segfault with gcc 4.7 -- Bug?


Here is a piece of code that I've been trying to compile:

#include <cstdio>

#define N 3

struct Data {
    int A[N][N];
    int B[N];
};

int foo(int uloc, const int A[N][N], const int B[N])
{
    for(unsigned int j = 0; j < N; j++) {
        for( int i = 0; i < N; i++) {
            for( int r = 0; r < N ; r++) {
                for( int q = 0; q < N ; q++) {
                   uloc += B[i]*A[r][j] + B[j];
                }
            }
        }
    }
    return uloc;
}

int apply(const Data *d)
{
    return foo(4,d->A,d->B);
}

int main(int, char **)
{
    Data d;
    for(int i = 0; i < N; ++i) {
        for(int j = 0; j < N; ++j) {
            d.A[i][j] = 0.0;
        }
        d.B[i] = 0.0;
    }

    int res = 11 + apply(&d);

    printf("%d\n",res);
    return 0;
}

Yes, it looks strange and does nothing useful as it stands, but it is the most concise reduction of a much larger program in which I first hit the problem.

It compiles and runs just fine with GCC (g++) 4.4 and 4.6, but if I use GCC 4.7 and enable -O3 optimization:

g++-4.7 -g -O3 prog.cpp -o prog

I get a segmentation fault when running it. gdb does not give much information about what went wrong:

(gdb) run
Starting program: /home/kalle/work/code/advect_diff/c++/strunt 

Program received signal SIGSEGV, Segmentation fault.
apply (d=d@entry=0x7fffffffe1a0) at src/strunt.cpp:25
25      int apply(const Data *d)
(gdb) bt
#0  apply (d=d@entry=0x7fffffffe1a0) at src/strunt.cpp:25
#1  0x00000000004004cc in main () at src/strunt.cpp:34

I've tried tweaking the code in different ways to see if the error goes away. It seems necessary to have all of the four loop levels in foo, and I have not been able to reproduce it by having a single level of function calls. Oh yeah, the outermost loop must use an unsigned loop index.
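
Going by that last observation, one variant worth trying is foo with a signed outermost index. The sketch below is only that: foo_signed and check_foo_signed are hypothetical names that are not part of the original program, and it has not been verified here against GCC 4.7.2 that this avoids the crash.

```cpp
#include <cassert>

#define N 3

// Hypothetical variant of foo: identical to the original except that
// the outermost loop index j is a signed int instead of unsigned int.
int foo_signed(int uloc, const int A[N][N], const int B[N])
{
    for (int j = 0; j < N; j++) {
        for (int i = 0; i < N; i++) {
            for (int r = 0; r < N; r++) {
                for (int q = 0; q < N; q++) {
                    uloc += B[i]*A[r][j] + B[j];
                }
            }
        }
    }
    return uloc;
}

// Sanity check with the all-zero inputs that main() sets up: nothing is
// added to uloc, so the starting value 4 comes back unchanged.
void check_foo_signed()
{
    const int A[N][N] = {{0}};
    const int B[N] = {0};
    assert(foo_signed(4, A, B) == 4);
}
```

With the all-zero data from main, a correctly compiled program adds 11 to that 4 and prints 15, which gives a quick way to tell a working build from a miscompiled one.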

I'm starting to suspect that this is a bug in the compiler or runtime, since it is specific to version 4.7 and I cannot see what memory accesses are invalid.

Any insight into what is going on would be very much appreciated.

It is possible to get the same situation with the C-version of GCC, with a slight modification of the code.

My system is:

Debian wheezy Linux 3.2.0-4-amd64 GCC 4.7.2-5


Okay, so I looked at the disassembly offered by gdb, but I'm afraid it doesn't tell me much:

Dump of assembler code for function apply(Data const*):
   0x0000000000400760 <+0>: push   %r13
   0x0000000000400762 <+2>: movabs $0x400000000,%r8
   0x000000000040076c <+12>:    push   %r12
   0x000000000040076e <+14>:    push   %rbp
   0x000000000040076f <+15>:    push   %rbx
   0x0000000000400770 <+16>:    mov    0x24(%rdi),%ecx
=> 0x0000000000400773 <+19>:    mov    (%rdi,%r8,1),%ebp
   0x0000000000400777 <+23>:    mov    0x18(%rdi),%r10d
   0x000000000040077b <+27>:    mov    $0x4,%r8b
   0x000000000040077e <+30>:    mov    0x28(%rdi),%edx
   0x0000000000400781 <+33>:    mov    0x2c(%rdi),%eax
   0x0000000000400784 <+36>:    mov    %ecx,%ebx
   0x0000000000400786 <+38>:    mov    (%rdi,%r8,1),%r11d
   0x000000000040078a <+42>:    mov    0x1c(%rdi),%r9d
   0x000000000040078e <+46>:    imul   %ebp,%ebx
   0x0000000000400791 <+49>:    mov    $0x8,%r8b
   0x0000000000400794 <+52>:    mov    0x20(%rdi),%esi

What should I see when I look at this?
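
One way to get oriented in a dump like this is to map the constant displacements off %rdi back to fields of Data. The sketch below assumes a 4-byte int (as on amd64 Linux) and C++11 for constexpr and static_assert; offset_of_A and offset_of_B are hypothetical helper names introduced only for illustration.

```cpp
#include <cstddef>  // offsetof, std::size_t

#define N 3

struct Data {
    int A[N][N];
    int B[N];
};

// Assumption: int is 4 bytes, as on amd64 Linux.
static_assert(sizeof(int) == 4, "sketch assumes 4-byte int");

// Byte offset of A[r][c] from the start of a Data object.
constexpr std::size_t offset_of_A(std::size_t r, std::size_t c)
{
    return offsetof(Data, A) + (r * N + c) * sizeof(int);
}

// Byte offset of B[i] from the start of a Data object.
constexpr std::size_t offset_of_B(std::size_t i)
{
    return offsetof(Data, B) + i * sizeof(int);
}

// Displacements that appear in the dump, matched to fields.
static_assert(offset_of_A(2, 0) == 0x18, "0x18(%rdi) is A[2][0]");
static_assert(offset_of_A(2, 1) == 0x1c, "0x1c(%rdi) is A[2][1]");
static_assert(offset_of_A(2, 2) == 0x20, "0x20(%rdi) is A[2][2]");
static_assert(offset_of_B(0) == 0x24, "0x24(%rdi) is B[0]");
static_assert(offset_of_B(1) == 0x28, "0x28(%rdi) is B[1]");
static_assert(offset_of_B(2) == 0x2c, "0x2c(%rdi) is B[2]");
```

Every load with a small constant displacement lands inside the 48-byte Data object. The load marked with => is the odd one out: it indexes with %r8, which at that point holds 0x400000000 rather than a small offset.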


Edit 2015-08-13: This seems to be fixed in g++ 4.8 and later.


Solution

  • Unfortunately, it is indeed a bug in gcc. I have not the slightest idea what it is doing there, but this is the generated assembly for the apply function (I compiled it without main, by the way, and foo is inlined into it):

    _Z5applyPK4Data:
            pushq   %r13
            movabsq $17179869184, %r8
            pushq   %r12
            pushq   %rbp
            pushq   %rbx
            movl    36(%rdi), %ecx
            movl    (%rdi,%r8), %ebp
            movl    24(%rdi), %r10d
    

    It crashes exactly at movl (%rdi,%r8), %ebp: that instruction adds a nonsensical 0x400000000 (decimal 17179869184, the value loaded into %r8 by the movabsq) to %rdi (the first parameter, i.e. the pointer to Data) and dereferences the result.
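
The arithmetic behind that can be checked directly. This is a small sketch assuming C++11; kBogusOffset and faulting_address are hypothetical names, and the pointer value 0x7fffffffe1a0 is the one from the gdb session in the question.

```cpp
#include <cstdint>

// The immediate loaded into %r8: decimal in the assembler listing,
// hexadecimal in the gdb disassembly.
constexpr std::uint64_t kBogusOffset = 17179869184ULL;
static_assert(kBogusOffset == 0x400000000ULL, "same constant, two notations");
static_assert(kBogusOffset == (4ULL << 32), "the value 4, shifted up by 32 bits");
static_assert(kBogusOffset == 16ULL * 1024 * 1024 * 1024, "exactly 16 GiB");

// Address read by movl (%rdi,%r8), %ebp: base register plus index register.
constexpr std::uint64_t faulting_address(std::uint64_t rdi, std::uint64_t r8)
{
    return rdi + r8;
}

// d = 0x7fffffffe1a0 in the gdb session, so the load targets an address
// 16 GiB past the start of the 48-byte Data object.
static_assert(faulting_address(0x7fffffffe1a0ULL, kBogusOffset) == 0x8003ffffe1a0ULL,
              "far outside the Data object");
```

The resulting address, 0x8003ffffe1a0, is above 0x00007fffffffffff, which is the top of the user-space range on typical x86-64 Linux configurations, so the load cannot hit mapped memory and the process receives SIGSEGV.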