hadoop, mapreduce, oozie, oozie-coordinator

Oozie output-events


I don't understand what output-events are used for in Oozie. The Oozie docs state that "A coordinator action can produce one or more dataset(s) instances as output", but they don't give any practical details or examples. What does it mean to produce a dataset instance as output? Does it mean that Oozie will create an output folder from the dataset's URI template? I don't really understand why I should use output-events...

Thanks!


Solution

  • If you are talking about Oozie, output datasets are used to connect different coordinator jobs. Consider a big DAG of coordinator jobs in which some jobs take other jobs' output as their input; the datasets are the edges of that DAG.

    For example, if the Oozie configuration specifies that Coordinator A's output is DS1, Coordinator B's output is DS2, and Coordinator C's inputs are DS1 and DS2, then Oozie guarantees that the corresponding action in Coordinator C will not be executed before DS1 and DS2 are ready.
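To make the A/C part of that example concrete, here is a minimal sketch of what the two coordinator definitions might look like. The host name, paths, dates, coordinator names, and the `outputDir`/`inputDs1`/`inputDs2` property names are all made up for illustration. As far as I understand it, Oozie does not create the output directory itself: `output-events` mainly lets the coordinator resolve the dataset URI for the current instance and hand it to the workflow through `coord:dataOut`, and the workflow is what actually writes the data (and the done flag).

```xml
<!-- Coordinator A (hypothetical): declares DS1 as its output. -->
<coordinator-app name="coord-A" frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2015-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
  <datasets>
    <dataset name="DS1" frequency="${coord:days(1)}"
             initial-instance="2015-01-01T00:00Z" timezone="UTC">
      <!-- Assumed path layout; one directory per day -->
      <uri-template>hdfs://namenode:8020/data/ds1/${YEAR}/${MONTH}/${DAY}</uri-template>
      <done-flag>_SUCCESS</done-flag>
    </dataset>
  </datasets>
  <output-events>
    <data-out name="out" dataset="DS1">
      <instance>${coord:current(0)}</instance>
    </data-out>
  </output-events>
  <action>
    <workflow>
      <app-path>hdfs://namenode:8020/apps/workflow-A</app-path>
      <configuration>
        <property>
          <!-- The workflow writes its results (and _SUCCESS) to this resolved URI -->
          <name>outputDir</name>
          <value>${coord:dataOut('out')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

```xml
<!-- Coordinator C (hypothetical): waits for DS1 and DS2 before running. -->
<coordinator-app name="coord-C" frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2015-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
  <datasets>
    <!-- Same definition as in Coordinator A, so both resolve to the same URIs -->
    <dataset name="DS1" frequency="${coord:days(1)}"
             initial-instance="2015-01-01T00:00Z" timezone="UTC">
      <uri-template>hdfs://namenode:8020/data/ds1/${YEAR}/${MONTH}/${DAY}</uri-template>
      <done-flag>_SUCCESS</done-flag>
    </dataset>
    <dataset name="DS2" frequency="${coord:days(1)}"
             initial-instance="2015-01-01T00:00Z" timezone="UTC">
      <uri-template>hdfs://namenode:8020/data/ds2/${YEAR}/${MONTH}/${DAY}</uri-template>
      <done-flag>_SUCCESS</done-flag>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="in1" dataset="DS1">
      <instance>${coord:current(0)}</instance>
    </data-in>
    <data-in name="in2" dataset="DS2">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>hdfs://namenode:8020/apps/workflow-C</app-path>
      <configuration>
        <property>
          <name>inputDs1</name>
          <value>${coord:dataIn('in1')}</value>
        </property>
        <property>
          <name>inputDs2</name>
          <value>${coord:dataIn('in2')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

With this setup, each daily action of Coordinator C should stay in a waiting state until the `_SUCCESS` flag exists in that day's DS1 and DS2 directories. The link between the coordinators is only the shared dataset definition (same URI template and frequency); Coordinator C does not reference Coordinator A by name.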