
How to set internal wall clock in a Fortran program?


I use Fortran for scientific computation on an HPC cluster. When we submit a job to an HPC job scheduler, we also specify a wall clock time limit for it. However, if the job is still writing output data when the time is up, it is terminated mid-write, which leaves 'NUL' values in the data and causes trouble in post-processing.

So, can we build an internal mechanism into the program so that the job stops itself gracefully some time before the HPC wall clock limit expires?

Related Question: How to skip reading "NUL" value in MATLAB's textscan function?


Solution

  • After realizing what you are asking, I noticed that I implemented similar functionality in my program very recently (commit https://bitbucket.org/LadaF/elmm/commits/f10a1b3421a3dd14fdcbe165aa70bf5c5001413f). I still have to set the time limit manually, though.

    The most important part:

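    time_stepping%clock_time_limit is the time limit in seconds. Convert it to the corresponding number of system clock ticks, capped slightly below the maximum count of the clock so that the limit can never exceed what the counter can represent: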

        call system_clock(count_rate = timer_rate)
        call system_clock(count_max = timer_max_count)   
    
        timer_count_time_limit = int( min(time_stepping%clock_time_limit &
                                            * real(timer_rate, knd),  &
                                          real(timer_max_count, knd) * 0.999_dbl) &
                                    , dbl)  
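
    As a self-contained illustration of this conversion, here is a minimal sketch outside the context of the original program. The module name clock_limit_sketch and the function limit_in_ticks are hypothetical, and the sketch assumes 64-bit counters (int64 and real64 from iso_fortran_env) instead of the knd and dbl kinds used above:

    ```fortran
    module clock_limit_sketch
      use, intrinsic :: iso_fortran_env, only: int64, real64
      implicit none
    contains
      ! Convert a time limit in seconds to system_clock ticks, capped just
      ! below max_count so that the result stays representable.
      ! (Hypothetical helper, not part of the original program.)
      pure function limit_in_ticks(limit_seconds, rate, max_count) result(ticks)
        real(real64),   intent(in) :: limit_seconds
        integer(int64), intent(in) :: rate, max_count
        integer(int64) :: ticks
        ticks = int(min(limit_seconds * real(rate, real64), &
                        real(max_count, real64) * 0.999_real64), int64)
      end function limit_in_ticks
    end module clock_limit_sketch
    ```

    The rate and max_count arguments would come from call system_clock(count_rate = rate, count_max = max_count) with int64 variables.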
    

    Start the timer just before entering the main loop:

    call system_clock(count = time_steps_timer_count_start)  
    

    Every check_period time steps, check the timer and exit the main loop, with error_exit set to .true., if the time is up:

      if (mod(time_step,time_stepping%check_period)==0) then
        if (master) then
          ! read the current tick count before comparing it with the start
          call system_clock(count = time_steps_timer_count_2)
          error_exit = time_steps_timer_count_2 - time_steps_timer_count_start > timer_count_time_limit
          if (error_exit) write(*,*) "Maximum clock time exceeded."
        end if
    
        ! ... MPI_Bcast error_exit from the master to the other processes here ...
    
        if (error_exit) exit
      end if
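
    To show the whole pattern in one place, here is a minimal serial sketch of a time-stepping loop that checks the clock periodically and leaves the loop early. It is not the code from the commit above: the module graceful_stop_sketch, the subroutine run_steps and all of its arguments are hypothetical, the MPI broadcast is only indicated by a comment, and the comparison uses >= so that a limit of zero triggers at the first check:

    ```fortran
    module graceful_stop_sketch
      use, intrinsic :: iso_fortran_env, only: int64, real64
      implicit none
    contains
      ! Run up to max_steps iterations, checking the wall clock every
      ! check_period steps and leaving the loop once limit_seconds has elapsed.
      subroutine run_steps(max_steps, check_period, limit_seconds, &
                           steps_done, stopped_early)
        integer,      intent(in)  :: max_steps, check_period
        real(real64), intent(in)  :: limit_seconds
        integer,      intent(out) :: steps_done
        logical,      intent(out) :: stopped_early
        integer(int64) :: rate, count_start, count_now, limit_ticks
        integer :: time_step

        call system_clock(count_rate = rate)
        limit_ticks = int(limit_seconds * real(rate, real64), int64)
        call system_clock(count = count_start)

        steps_done = 0
        stopped_early = .false.
        do time_step = 1, max_steps
          ! ... the real work of one time step would go here ...
          steps_done = time_step

          if (mod(time_step, check_period) == 0) then
            call system_clock(count = count_now)
            stopped_early = count_now - count_start >= limit_ticks
            ! In an MPI code, broadcast stopped_early from the master rank here.
            if (stopped_early) exit
          end if
        end do
        ! Final output and file closing would happen here, while there is still time.
      end subroutine run_steps
    end module graceful_stop_sketch
    ```

    Checking only every check_period steps keeps the overhead of the clock query (and, in a parallel code, of the broadcast) small, at the cost of stopping up to check_period steps late, so the limit should leave room for that plus the final output.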
    

    Now, you may want to get the time limit from your scheduler automatically. How to do that varies between job schedulers. There is usually an environment variable such as $PBS_WALLTIME. See the question "Get walltime in a PBS job script", but check your scheduler's manual.

    You can read this variable in Fortran using the intrinsic subroutine GET_ENVIRONMENT_VARIABLE().
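
    A minimal sketch of that last step follows. It assumes the variable holds a plain integer number of seconds; a scheduler that reports the limit in another form, such as HH:MM:SS, would need different parsing. The module walltime_env_sketch and the function seconds_from_env are hypothetical names:

    ```fortran
    module walltime_env_sketch
      implicit none
    contains
      ! Return the integer value of environment variable `name`, or
      ! `default_seconds` if it is unset, empty, too long, or not an integer.
      function seconds_from_env(name, default_seconds) result(seconds)
        character(len=*), intent(in) :: name
        integer,          intent(in) :: default_seconds
        integer :: seconds
        character(len=32) :: value
        integer :: length, stat, ios

        seconds = default_seconds
        call get_environment_variable(name, value, length, stat)
        if (stat /= 0 .or. length == 0) return   ! unset, empty, or truncated

        read(value(1:length), *, iostat=ios) seconds
        if (ios /= 0) seconds = default_seconds
      end function seconds_from_env
    end module walltime_env_sketch
    ```

    For example, seconds_from_env("PBS_WALLTIME", 3600) would return the scheduler's value when that variable is set to an integer, and 3600 otherwise. The result, minus a safety margin for the final output, can then serve as the clock time limit.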