multithreading, segmentation-fault, coredump, core-file

Missing core files when SEGV occurs in a thread other than the main thread


I am currently debugging a segfault in one of our C++ applications, and it is giving me a hard time because no core files are generated when the segfault occurs.

After some reading and checking ulimits and so on, I can reproduce the case of core files not being generated. It seems to be related to threads. To test this, I altered our software to generate a SEGV artificially.
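One way to run such an experiment without touching the real application is sketched below. It assumes Linux and a C++11 compiler; `crash_child`, `died_from_segv` and `dumped_core` are hypothetical helper names, not code from our software. The crash happens in a forked child so that the caller survives and can inspect the wait status.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <csignal>
#include <thread>

// raise() delivers the signal to the calling thread, so the fault is
// triggered in whichever thread runs this function.
static void crash_here() { std::raise(SIGSEGV); }

// Hypothetical helper: fork a child, crash it either in its main thread or
// in a freshly started worker thread, and return the child's wait status
// (-1 if fork or waitpid failed).
int crash_child(bool in_worker) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        if (in_worker) {
            std::thread worker(crash_here);
            worker.join();
        } else {
            crash_here();
        }
        _exit(1);  // only reached if the signal did not terminate the child
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return status;
}

// True if the wait status says the child was killed by SIGSEGV.
bool died_from_segv(int status) {
    return status != -1 && WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}

// True if the kernel flagged a core dump. WCOREDUMP is a non-POSIX
// extension that is available on Linux.
bool dumped_core(int status) {
    return status != -1 && WIFSIGNALED(status) && WCOREDUMP(status);
}

// Sanity check: both variants must at least die from SIGSEGV.
void check_both_variants() {
    assert(died_from_segv(crash_child(false)));
    assert(died_from_segv(crash_child(true)));
}
```

Comparing `dumped_core(crash_child(false))` with `dumped_core(crash_child(true))` under the same ulimit and core_pattern settings should show whether the two cases differ on a given machine.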

Now the following pattern emerges:

  • SEGV in main thread -> core file is generated
  • SEGV in non-main thread -> no core file is generated

Then, in order not to alter the program itself, I tried the same thing by sending signals.

  • Sending "kill -s SIGSEGV <pid>" -> core file is generated

Then I searched under /proc/<pid>/task for a non-main thread and took its ID.

  • Sending "kill -s SIGSEGV <threadid>" -> no core file is generated
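The thread IDs under /proc/<pid>/task can also be listed programmatically, together with the name of each thread. The following is a minimal sketch, assuming Linux with procfs mounted and a kernel that provides the per-thread `comm` file; `thread_names` and `own_thread_names` are hypothetical helper names.

```cpp
#include <dirent.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <map>
#include <string>

// Map each thread ID of process `pid` to its name, read from
// /proc/<pid>/task/<tid>/comm. Returns an empty map if the task
// directory cannot be opened.
std::map<long, std::string> thread_names(long pid) {
    assert(pid > 0);
    std::map<long, std::string> names;
    const std::string task_dir = "/proc/" + std::to_string(pid) + "/task";
    DIR* dir = opendir(task_dir.c_str());
    if (dir == nullptr) return names;
    while (dirent* entry = readdir(dir)) {
        char* end = nullptr;
        long tid = std::strtol(entry->d_name, &end, 10);
        // Skip "." and ".." and anything else that is not a plain number.
        if (end == entry->d_name || *end != '\0') continue;
        std::ifstream comm_file(task_dir + "/" + entry->d_name + "/comm");
        std::string comm;
        std::getline(comm_file, comm);  // stays empty if the file is missing
        names[tid] = comm;
    }
    closedir(dir);
    return names;
}

// Convenience wrapper for the calling process.
std::map<long, std::string> own_thread_names() {
    return thread_names(static_cast<long>(getpid()));
}
```

The main thread appears under a thread ID equal to the process ID; every other entry is a candidate for the per-thread kill test.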

Do you know of any thread specific properties that would explain such a behaviour?

I also tried the same code under different operating systems, and this only occurs in our production environment (Red Hat 6), not under Ubuntu. I am still trying to figure out whether it might be related to debug/non-debug builds.

Still, the behavior seems so strange that it must be caused by some subtlety. If I wanted to create this behavior on purpose, I would not even know what to change to achieve it.

Any help is appreciated.

Best regards, Matthias


Solution

  • For what it's worth: it had something to do with core_pattern, which I found out by trial and error.

    core_pattern  core                   -> corefile
    core_pattern  /opt/tmp/core          -> corefile
    core_pattern  /opt/tmp/core_%e.%p    -> no corefile
    core_pattern  /opt/tmp/core_%e       -> no corefile
    core_pattern  /opt/tmp/core_%h       -> corefile
    core_pattern  /opt/tmp/core_%h_%p    -> corefile
    core_pattern  /opt/tmp/core_%h_%p_%e -> no corefile
    

    So %e seems to be the reason why sometimes no core file is written. The question "core dump filename gets thread name instead of executable name with core_pattern %e.%p.core" explains what is going on: %e is not the executable name but the name of the crashing thread, which in my case contains "/". A "/" in the name turns the expanded pattern into a path inside a subdirectory that presumably does not exist, so the core file cannot be created.

    This also explains why a SEGV behaves differently in different threads, and why my simplest programs did not show the problem: there was no code that gave names to the threads.
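The effect can be illustrated with a small model. The sketch below assumes Linux; `name_current_thread`, `current_thread_name`, `expand_core_pattern` and `target_directory` are hypothetical helpers, and the thread name "net/io" used in the usage note is made up. The expansion handles only %e, %p, %h and %% and substitutes the thread name verbatim, which matches what I observed on the Red Hat 6 machine; other kernels may sanitize the name differently.

```cpp
#include <sys/prctl.h>
#include <cassert>
#include <cstddef>
#include <string>

// Set the name (comm) of the calling thread. Linux keeps at most
// 15 characters of it.
bool name_current_thread(const std::string& name) {
    return prctl(PR_SET_NAME, name.c_str(), 0UL, 0UL, 0UL) == 0;
}

// Read the name (comm) of the calling thread back from the kernel.
std::string current_thread_name() {
    char buf[16] = {0};  // PR_GET_NAME needs a buffer of at least 16 bytes
    if (prctl(PR_GET_NAME, buf, 0UL, 0UL, 0UL) != 0) return "";
    return buf;
}

// Simplified model of core_pattern expansion (assumption: verbatim
// substitution, only %e, %p, %h and %% are handled).
std::string expand_core_pattern(const std::string& pattern,
                                const std::string& comm,
                                int pid,
                                const std::string& host) {
    assert(pid >= 0);
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (pattern[i] != '%') {
            out += pattern[i];
            continue;
        }
        if (++i == pattern.size()) break;  // a trailing single % is dropped
        switch (pattern[i]) {
            case 'e': out += comm; break;
            case 'p': out += std::to_string(pid); break;
            case 'h': out += host; break;
            case '%': out += '%'; break;
            default: break;  // other specifiers are ignored in this sketch
        }
    }
    return out;
}

// Directory in which the expanded path would be created.
std::string target_directory(const std::string& path) {
    std::size_t slash = path.rfind('/');
    if (slash == std::string::npos) return ".";
    if (slash == 0) return "/";
    return path.substr(0, slash);
}
```

If a thread calls `name_current_thread("net/io")`, then `expand_core_pattern("/opt/tmp/core_%e.%p", current_thread_name(), 1234, "host1")` yields `/opt/tmp/core_net/io.1234`, whose target directory is `/opt/tmp/core_net` rather than `/opt/tmp`. The same name with `/opt/tmp/core_%h_%p` stays inside `/opt/tmp`, which fits the table above.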