Tags: jenkins, groovy, timeout, jenkins-pipeline, jenkins-groovy

Timeout for node allocation in Jenkins using a Groovy script


Question: I need to set a 2-hour limit on node allocation. If a node is allocated within the time limit, the build should continue; if no node is allocated within that time, the build should be aborted.

I tried the timeout step, but with it the build is aborted mid-execution once the time limit is reached, even when the node was allocated in time, because the timeout also covers the work done on the node.

startTime = System.currentTimeMillis()
timeout(time: 2, unit: 'HOURS') {
    node('Slave_Node') {
        // Will run on the slave
    }
}

While waiting, the console prints something like "Waiting for next available executor on 'Slave_Node'".

Once the node is allocated, it prints "Running on Slave_Node".

Can anyone suggest an implementation, please? Thanks :)


Solution

  • Here is a solution that I believe works standalone, though be careful, as it may not be complete. I was not able to turn it into a portable, reusable "getNode()" library function.

    wait_for_node = 120 // minutes
    waitNode_mS = wait_for_node * 60 * 1000        // minutes into milliseconds
    got_a_node = false
    targetLabel = 'DESIRED_AGENT'
    bodytimeout = 120 // minutes 
    
    parallel waitfor: {
        if (waitNode_mS > 0) {
            startms = System.currentTimeMillis()
            waitUntil(initialRecurrencePeriod: 5000, quiet: true)
                { got_a_node || ((System.currentTimeMillis() - startms) > waitNode_mS) } // end waitUntil
            if (!got_a_node) {
                error "ABORTING: Failed to get $targetLabel in $wait_for_node minutes"
            } else {
                echo "Got a $targetLabel - Exiting waitfor"
            }
        } else {
            // wait_for_node = 0: this leg exits cleanly, so node() may wait indefinitely
            echo "WARNING: MAY wait FOREVER."
        }
    }, executeon: {
        node(targetLabel) {
            got_a_node = true
            try {
                timeout(time: bodytimeout, unit: 'MINUTES') {
                    // -------------------------------
                    // MAIN BODY: DO THE WORK IN HERE
                    // -------------------------------
                } // end bodyTimeout
            } catch (org.jenkinsci.plugins.workflow.steps.FlowInterruptedException e) {
                throw (e)
            } finally {
                // When leaving this node, clean up the workspace.
                deleteDir()
            } // try/catch/finally executing on node
        } // exit node(targetLabel)
    }, failFast: true
    

    Notes about this solution:

    • The two maximum times are set with the wait_for_node and bodytimeout values (both in minutes).
    • A parallel step is used; its waitfor leg polls with a waitUntil step while the executeon leg requests the node.
    • If waitUntil exits because the time limit passed before a node was obtained, the waitfor leg throws an error and, because the parallel has failFast: true, everything halts. If it exits because the node was obtained, the waitfor leg simply finishes.
    • If wait_for_node is set to zero, the build waits forever for the node (the legacy behavior).
    • There may be race conditions. For example, if the wait limit is reached at the same moment the node is allocated, the MAIN BODY will be aborted.
    • The inner (MAIN BODY) work is wrapped in its own timeout() step, so it has a separate limit from the node wait.
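    • The waitUntil polling isn't particularly efficient: it brute-force checks the system clock. The step was intended for short waits, so the interval between checks only grows up to a built-in cap (15 seconds according to the step's documentation), which means a long wait involves many polls.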
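
One possible way to package the pattern above as a reusable function is sketched below. This is an untested sketch, not the original author's code: the function name withNodeOrAbort, its parameters, and the use of a small map to share the "got a node" flag between the two parallel legs are all assumptions made for illustration. It uses only the steps that already appear in the solution (parallel, waitUntil, node, timeout, error, echo, deleteDir).

```groovy
// Hypothetical helper: name and parameters are illustrative, not from the original answer.
// label       - agent label to request
// waitMinutes - maximum time to wait for the node (0 = wait indefinitely)
// bodyMinutes - maximum time allowed for the work once the node is allocated
// body        - closure containing the work to run on the node
def withNodeOrAbort(String label, int waitMinutes, int bodyMinutes, Closure body) {
    // Flag shared by both legs: set by executeon, read by waitfor.
    def state = [gotNode: false]
    long waitMs = waitMinutes * 60L * 1000L

    parallel(
        waitfor: {
            if (waitMs <= 0) {
                echo "WARNING: no allocation limit set; may wait forever for ${label}."
                return
            }
            long startMs = System.currentTimeMillis()
            // Poll until the node is obtained or the allocation limit has passed.
            waitUntil(initialRecurrencePeriod: 5000, quiet: true) {
                state.gotNode || (System.currentTimeMillis() - startMs) > waitMs
            }
            if (!state.gotNode) {
                // With failFast: true this also cancels the pending node request.
                error "ABORTING: failed to get ${label} in ${waitMinutes} minutes"
            }
        },
        executeon: {
            node(label) {
                state.gotNode = true
                try {
                    timeout(time: bodyMinutes, unit: 'MINUTES') {
                        body()
                    }
                } finally {
                    // Clean up the workspace when leaving the node.
                    deleteDir()
                }
            }
        },
        failFast: true
    )
}

// Example call, assuming the 'Slave_Node' label from the question:
withNodeOrAbort('Slave_Node', 120, 120) {
    echo 'Running the main body on the allocated node'
}
```

The same race condition noted above still applies: if the allocation limit expires at the moment the node is granted, the body may be aborted. Whether this works unchanged from a shared library would need to be verified on a real Jenkins instance.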