
How to execute a custom command when ctest times out on Jenkins


I have a Jenkins that executes ctest which in turn executes several unit tests. A global timeout of 120 minutes for a test run is configured.

One of my test programs sporadically gets stuck and is killed by the configured timeout.

What I would like to have is a core dump of the test program at the moment it is stuck. So I'd like to execute a custom command (e.g. gcore XXX) whenever the timeout is reached.

How can I configure that in Jenkins and/or ctest?


Solution

  • I wrote my own, non-portable script to do the job; it relies on pgrep, ps, awk, pstree, and gcore. Hopefully it helps or inspires others.

    #!/usr/bin/env ruby
    
    # Watches the children of ctest.
    # Takes a gcore of the child and kills it if its runtime exceeds the configured timeout.
    
    # make test will show a line like this if this watchdog killed the test:
    #      Start 49: test_logging
    #49/86 Test #49: test_logging ......................***Exception: Other     62.33 sec
    
    require "time"
    
    TIMEOUT_SEC = (ENV["TIMEOUT_SEC"] || 23*60).to_i
    DIR_CORES = ENV["DIR_CORES"] || "/tmp/corefiles/"
    KILL_SIGNAL = ENV["KILL_SIGNAL"] || 9
    SLEEP_TIME_SEC = (ENV["SLEEP_TIME_SEC"] || 5).to_i
    
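    puts "Started ctest watchdog."
    # Detach into the background. By default Process.daemon redirects
    # stdout/stderr to /dev/null, so the messages printed below are discarded
    # unless it is called with noclose set (Process.daemon(false, true)).
    puts Process.daemon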
    
    while true do
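        # -x matches the process name exactly; -o picks the oldest match so
        # that a single PID is returned even if several ctest processes run.
        pid_ctest = %x(pgrep -o -x ctest).strip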
        if !pid_ctest.nil? && !pid_ctest.empty?
    #       puts "ctest: #{pid_ctest}"
            pid_child = %x(ps -o ppid= -o pid= -A | awk '$1 == #{pid_ctest}{print $2}').strip
            if !pid_child.nil? && !pid_child.empty?
    #           puts "child: #{pid_child}"
                runtime_child = %x(ps -o etime= -p #{pid_child}).strip
                timeary = runtime_child.strip.split(":")
                # etime is printed as [hh:]mm:ss here; default every field to 0
                hour = min = sec = 0
                if timeary.length > 2
                    hour = timeary[0]
                    min = timeary[1]
                    sec = timeary[2]
                else
                    min = timeary[0]
                    sec = timeary[1]
                end
    
                res = %x(pstree #{pid_ctest})
                ary = res.split("-")
                ary.delete_if {|x| x.empty?}
                child_name = ary[1].strip
    
                t = hour.to_i*60*60 + min.to_i*60 + sec.to_i
                if t > TIMEOUT_SEC
                    puts "kill child: #{pid_child} #{runtime_child} #{t.to_i}"
    
                    puts "dumping core to #{DIR_CORES}/#{child_name}"
                    %x(gcore -o #{DIR_CORES}/#{child_name} #{pid_child} )
                    puts "killing with signal #{KILL_SIGNAL}"
                    %x(kill --signal #{KILL_SIGNAL} #{pid_child})
                else
                    puts "Leaving child alive. ctest: #{pid_ctest}, child: #{pid_child}, name: #{child_name}, runtime: #{runtime_child}, in sec: #{t}. Killing in #{TIMEOUT_SEC-t} sec"
                end
            end
        end
    
        sleep SLEEP_TIME_SEC
    end
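
The script splits the `ps -o etime=` value on ":" and treats it as [hh:]mm:ss. On Linux, ps prints elapsed time as [[dd-]hh:]mm:ss, so a child that has been running for more than a day would be misread. With a 120-minute global timeout this is unlikely to matter, but a more defensive conversion could look like the sketch below. The helper name `etime_to_seconds` is hypothetical and not part of the original script.

```ruby
# Convert a `ps -o etime=` value of the form [[dd-]hh:]mm:ss into seconds.
# Hypothetical helper; the original script does this parsing inline.
def etime_to_seconds(etime)
  clock = etime.strip
  days = 0

  # An optional day count precedes the clock part, separated by "-".
  if clock.include?("-")
    day_part, clock = clock.split("-", 2)
    days = day_part.to_i
  end

  # Pad missing leading fields (hours) with zero so we always have h, m, s.
  parts = clock.split(":").map(&:to_i)
  parts.unshift(0) while parts.length < 3
  hours, minutes, seconds = parts

  days * 86_400 + hours * 3_600 + minutes * 60 + seconds
end
```

In the watchdog loop, `t = etime_to_seconds(runtime_child)` could then replace the manual hour/min/sec handling.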