We have a multi-process application where debugging a specific child process is proving difficult. Because of messaging timeouts between the processes, we don't have time to attach gdb to the target child, so I was wondering whether I can stop a process via a SystemTap probe.
I think a simple probe should be all that's needed, e.g.:
probe process("exeName").mark("STOP_HERE")
{
force_sig(SIGSTOP, current);
}
Unfortunately, the above doesn't compile. Any ideas?
I'm not a SystemTap expert, so this probably isn't the best approach, but here's my crude solution for anyone interested:
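#!/bin/stap -g
# Arguments: @1 = path to the executable, @2 = name of the marker to stop at.
global gdbRunning = 0;
probe process(@1).mark(@2)
{
    # Stop the process that hit the marker (embedded C needs guru mode, -g).
    raise(%{ SIGSTOP %});
    gdbCmd = sprintf("cgdb -- -q -ex 'thread find %d' %s %d", tid(), @1, pid());
    if (gdbRunning == 0)
    {
        # First hit: launch the debugger against the stopped process.
        gdbRunning = 1;
        printf("STOP PID %d TID %d [%s]\n", pid(), tid(), gdbCmd);
        system(gdbCmd);
    }
    else
    {
        # Later hits: only report them; a debugger has already been started.
        printf("STOP PID %d TID %d\n", pid(), tid());
    }
}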
See man function::raise(3stap), new as of systemtap 2.3 (2013-07-25).
stap -g -e 'probe WHATEVER { raise(%{ SIGSTOP %}) }'
You need guru mode (the -g option) to let your script use this function.
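Once the probe has fired, the target simply sits in the stopped state until something sends it SIGCONT, so there is no rush to attach. Here is a rough sketch of checking for that state and resuming by hand. It assumes a Linux-style ps that accepts -o stat=, and it uses a throwaway sleep process as a stand-in for the real child; with the real thing the PID would come from the stap output, and the SIGSTOP would come from the probe rather than from kill.

```shell
#!/bin/sh
# Stand-in for the probed child process (hypothetical; not the real application).
sleep 30 &
pid=$!

# Same signal the probe raises, sent here from outside for illustration.
kill -STOP "$pid"
sleep 1   # give the kernel a moment to park the process

# A stopped process reports a state beginning with "T" in ps.
state_stopped=$(ps -o stat= -p "$pid")
echo "after SIGSTOP: $state_stopped"

# This is the window in which a debugger can be attached, e.g. gdb -p "$pid".

# Resume the process once you are done.
kill -CONT "$pid"
sleep 1
state_running=$(ps -o stat= -p "$pid")
echo "after SIGCONT: $state_running"

# Clean up the stand-in process.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

Checking the state first is a cheap way to confirm the marker was actually hit before you go hunting for the right PID to attach to.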