I'm running several simulations using Condor and have coded the program so that it outputs a progress status in the console. This is done at the end of a loop where it simply prints the current time (this can also be percentage or elapsed time). The code looks something like this:
```c
printf("START");
while (programNeedsToRun) {
    // Run repetitive code...

    // Print program status update
    printf("[%i:%i:%i]\r\n", hours, minutes, seconds);
}
printf("FINISH");
```
When executed normally (i.e. in a terminal/cmd/bash) this works fine, but the Condor nodes don't seem to printf() the status. Only once the simulation has finished are all the status updates written to the file, and by then they are no longer of use. The *.sub file that I submit to Condor looks like this:
```
universe   = vanilla
executable = program
output     = out/out-$(Process)
error      = out/err-$(Process)
queue 100
```
When submitted, the program executes (this is confirmed in condor_q) and the output files contain this:
```
START
```
Only once the program has finished running does its corresponding output file show the rest (example):
```
START
[0:3:4]
[0:8:13]
[0:12:57]
[0:18:44]
FINISH
```
Whilst the program executes, the output file only contains the START text. So I came to the conclusion that the file is not updated while the node executing the program is busy. My question is: is there a way of updating the output files manually, or of gathering information on the program's progress in a better way?
Thanks already
Max
What you want to do is use the streaming output options. See the stream_error and stream_output options you can pass to condor_submit, as outlined here: http://research.cs.wisc.edu/htcondor/manual/current/condor_submit.html
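As a sketch, assuming the submit file from the question is otherwise left unchanged, enabling both options might look like this (only the two stream_* lines are new):

```
universe      = vanilla
executable    = program
output        = out/out-$(Process)
error         = out/err-$(Process)
stream_output = True
stream_error  = True
queue 100
```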
By default, HTCondor stores stdout and stderr locally on the execute node and transfers them back to the submit node on job completion. Setting stream_output to TRUE asks HTCondor to instead stream the output back to the submit node as it occurs. You can then inspect it as it happens.