I'm analyzing a core dump file created by SIGSEGV using gdb. I get the line number for the C source, but when I evaluate the expression, I get the correct value (the expression is
local_var = ((array[index])->field[index2]).field2
where array is a global variable). The values of index and index2 are optimized out (of course :-( ), but I computed them a couple of times and each time I got the same valid value. Out of despair I've checked the disassembled code and the registers and got this:
0x00002b083e06d84c <+142>: mov %r13d,%edx # index (234) to edx
0x00002b083e06d84f <+145>: mov 0x2039fa(%rip),%rax # 0x2b083e271250 (address of array)
0x00002b083e06d856 <+152>: mov (%rax,%rdx,8),%rdx # array[index] (0x2b083e271250+8*234) to rdx
0x00002b083e06d85a <+156>: movslq %ecx,%rax # index2 to rax
=> 0x00002b083e06d85d <+159>: mov 0x28(%rdx),%rdx # array[index]->field to rdx
The comments are my understanding of the code. The SIGSEGV is received at the last instruction. The contents of the registers:
rax 0x5 5
rbx 0x2aaad4096a9c 46913190193820
rcx 0x5 5
rdx 0x0 0
rsi 0xea 234
rdi 0xc75000a9 3343909033
rbp 0x41f898c0 0x41f898c0
rsp 0x41f898a0 0x41f898a0
r8 0x2aaacb411c60 46913042848864
r9 0x2020202020207475 2314885530818475125
r10 0x52203c3c20202020 5917796139299512352
r11 0x2b083bb29070 47314361290864
r12 0xc75000a9 3343909033
r13 0xea 234
r14 0x0 0
r15 0x2aaad40966a4 46913190192804
rip 0x2b083e06d85d 0x2b083e06d85d
Because rdx is 0, I understand the segmentation fault at the last instruction: the code tried to read from 0x28, which is not accessible. What I don't understand is why rdx is 0. In the first line edx gets the value 234 (the r13 register is not modified after that instruction, and this is the valid value of index I've computed). In the third line the 8 bytes at 0x2b083e5b6f20+(8*234) = 0x2b083e5b7670 are assigned to rdx, but those bytes are not 0:
(gdb) x/2 0x2b083e5b7670
0x2b083e5b7670: 0x3e578900 0x00002b08
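As a sanity check on the arithmetic, here is a small C sketch that recomputes the slot address and joins the two 32-bit words printed by x/2 into one 64-bit pointer value. The helper names slot_address, join_words and print_check are made up for illustration, and the constants are the ones quoted above.

```c
#include <stdint.h>
#include <stdio.h>

/* Address of element `index` in an array of 8-byte pointers starting at `base`. */
uint64_t slot_address(uint64_t base, uint64_t index)
{
    return base + 8 * index;
}

/* x/2 prints two 32-bit words; on little-endian x86_64 the low word comes first. */
uint64_t join_words(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Hypothetical helper: prints both values using the numbers from the core dump. */
void print_check(void)
{
    printf("slot  = 0x%llx\n",
           (unsigned long long)slot_address(0x2b083e5b6f20ULL, 234));
    printf("value = 0x%llx\n",
           (unsigned long long)join_words(0x3e578900u, 0x00002b08u));
}
```

Calling print_check() from main gives slot = 0x2b083e5b7670 and value = 0x2b083e578900, so the slot address is right and the pointer stored there in the core is not NULL.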
How does rdx end up with the value 0?
I'm doing this on x86_64 Linux and this is a multithreaded program. Could this be a hardware error? The SIGSEGV doesn't always happen.
this is a multithreaded program. The SIGSEGV doesn't always happen.
It sounds like you may have a data race: after the current thread loaded array[index] (which was 0 at the time), some other thread came in and wrote a different value there (you now observe the new value in the core).
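To illustrate the kind of race I mean, here is a minimal sketch. It is not your actual code: the struct layouts, the SLOTS size and the helpers publish_entry and read_field2 are all assumptions. The writer publishes a fully initialised entry with a release store; the reader loads the slot once with an acquire load and refuses to dereference it if it is still NULL, which is the state your faulting thread apparently saw.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SLOTS 256                        /* assumed size; the real one is unknown */

struct inner { int field2; };            /* hypothetical layout */
struct outer { struct inner *field; };   /* hypothetical layout */

/* Global array of pointers as in the question, but with atomic slots. */
static _Atomic(struct outer *) array[SLOTS];

/* Writer: the entry must be fully initialised before this call publishes it. */
void publish_entry(size_t index, struct outer *entry)
{
    atomic_store_explicit(&array[index], entry, memory_order_release);
}

/* Reader: load the slot once; return -1 instead of dereferencing NULL. */
int read_field2(size_t index, size_t index2, int *out)
{
    struct outer *entry =
        atomic_load_explicit(&array[index], memory_order_acquire);
    if (entry == NULL)
        return -1;                       /* slot not published yet */
    *out = entry->field[index2].field2;
    return 0;
}
```

Protecting both the write and the read with the same mutex would work just as well; the point is only that the reader must not see a half-published slot.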
Could this be a hardware error?
Anything is possible, but a data race is 99.99% more probable.