
Oozie Behavior with misaligned start


I noticed that if I start an Oozie coordinator whose start time lies many "iterations" (in terms of the frequency) before the current time, the coordinator runs the workflow several times back to back, ignoring the assigned frequency. For me, however, it is more important that the workflow/action runs at the assigned frequency than that it has run the correct number of times by a given point.

Is there any way I can avoid this behavior? One way would obviously be to make sure the start time falls within one iteration of the current time (is there a way to have it take the current time automatically?). Another would be to configure the coordinator to avoid this behavior altogether and simply run at the next time it should, given the start time and the frequency.


Solution

  • The obvious way to avoid side effects from "past" start dates is... to set the actual start date at submission time as "now".

    That's the way we do it in my team:

    • on the local filesystem, write a "coord-template.xml" with a placeholder such as start="%Now%"
    • just before submitting, generate the actual "coordinator.xml" with

      sed "s/%Now%/$(date --utc '+%FT%TZ')/" coord-template.xml > coordinator.xml

    • upload the coordinator definition to HDFS then submit it via Oozie CLI
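
The three steps above could be wrapped in a small script along these lines. This is only a sketch: the function names, the `HDFS_APP_DIR` and `OOZIE_URL` variables, and the `job.properties` file name are assumptions for illustration, not part of the original setup.

```shell
#!/usr/bin/env bash

# Render the template: replace the %Now% placeholder with the current UTC time,
# using the same date format as the sed one-liner above.
render_coordinator() {
  local template="$1" output="$2"
  local now
  now="$(date -u '+%FT%TZ')"
  sed "s/%Now%/${now}/" "$template" > "$output"
}

# Upload the rendered definition to HDFS, then submit it via the Oozie CLI.
# HDFS_APP_DIR, OOZIE_URL and job.properties are hypothetical; adjust to your cluster.
submit_coordinator() {
  local output="$1"
  hdfs dfs -put -f "$output" "${HDFS_APP_DIR}/coordinator.xml"
  oozie job -oozie "${OOZIE_URL}" -config job.properties -run
}
```

Typical use would be `render_coordinator coord-template.xml coordinator.xml` followed by `submit_coordinator coordinator.xml`, so the start time is always stamped immediately before submission.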

    ~~~~~~~~~~~~

    Alternative: if you are using a "basic" frequency (not CRON-like scheduling), you may want to try these <controls> to have Oozie create executions for all "past" time slots but discard them immediately:

      <throttle>1</throttle>
    

    and/or

      <execution>LAST_ONLY</execution>
    

    cf. Oozie 4.x reference

    The same rules also apply when the Coordinator is suspended then resumed, when the Oozie service is stopped then restarted, or when YARN has to queue new jobs for a really long time (because the cluster is 100% busy).
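
For context, the `<controls>` settings mentioned above sit inside the coordinator definition roughly as sketched below. The application name, dates, frequency, schema version and workflow path are placeholders chosen for illustration; check them against the Oozie version you actually run.

```xml
<!-- Sketch only: name, start/end, frequency and app-path are hypothetical -->
<coordinator-app name="my-coord" frequency="${coord:hours(1)}"
                 start="2016-01-01T00:00Z" end="2026-01-01T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <controls>
    <!-- keep only the most recent pending time slot, drop older ones -->
    <execution>LAST_ONLY</execution>
    <!-- limit how many materialized actions may be waiting at once -->
    <throttle>1</throttle>
  </controls>
  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
    </workflow>
  </action>
</coordinator-app>
```

With a layout like this, a start date far in the past should leave at most one catch-up run pending rather than a long back-to-back series.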