
Oozie stuck in RUNNING status


I use Oozie to run workflows, but even the simple official shell-wf example (echo hello oozie) gets stuck in the RUNNING state and never finishes. The workflow is submitted successfully, and the job log in the Oozie UI shows no errors.

When I submit a shell action that calls spark-submit, the Spark job is never submitted and does not appear in the Spark UI. I suspect the shell script never runs at all.

What could be the problem?


Solution

  • A Quick Checklist

    For anyone with the same problem, here is a checklist to run through on your system. Hope it helps!

    1. Check the jobTracker setting in your Oozie configuration. Note: if any job has already run successfully, jobTracker is probably not the problem. Related discussion can be found here
    2. Check your disk usage. If disk usage is above 90%, remove files until it drops below 90%. (That was my case!)
    3. Check the Console URL of the stuck action. It can be found under Job - Job Info tab - Actions - Action - Action Info tab. The job state shown there may help you find the problem.
    4. Check the Oozie log, which is typically in /usr/local/oozie/logs. Look through oozie.log* for exceptions.
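
    Steps 2 and 4 can be scripted. The following is only a minimal sketch, not part of Oozie or Hadoop: the function names are made up for illustration, the 90% limit is the figure from step 2, the log path is the typical location from step 4, and checking "/" is an assumption (point it at whichever directories your cluster actually writes to).

```python
import glob
import shutil

# Assumption: 90% is the limit quoted in step 2 of the checklist.
DISK_LIMIT_PERCENT = 90.0


def disk_usage_percent(path):
    """Return the share of the filesystem holding `path` that is not
    available to the current user, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * (1.0 - usage.free / usage.total)


def find_exceptions(log_glob):
    """Return (file, line number, text) for each log line mentioning an exception."""
    hits = []
    for log_file in sorted(glob.glob(log_glob)):
        with open(log_file, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if "Exception" in line:
                    hits.append((log_file, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Hypothetical choice of path: replace "/" with your data and log directories.
    for path in ["/"]:
        percent = disk_usage_percent(path)
        flag = "TOO FULL" if percent > DISK_LIMIT_PERCENT else "ok"
        print(f"{path}: {percent:.1f}% used ({flag})")

    # Typical log location from step 4; adjust if Oozie is installed elsewhere.
    for log_file, number, text in find_exceptions("/usr/local/oozie/logs/oozie.log*"):
        print(f"{log_file}:{number}: {text}")
```

    Run it on every node that can host containers, since a single full disk on one worker may be enough to leave an action waiting.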

    Details

    Disk usage

    If your action state is

    YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM.

    then a full disk may be the cause. Related discussion can be found in "MapReduce job hangs, waiting for AM container to be allocated". Solutions can be found in "Why does Hadoop report 'Unhealthy Node local-dirs and log-dirs are bad'?".
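
    One way to confirm this is to ask the ResourceManager which nodes it considers unhealthy. The sketch below assumes the ResourceManager REST API is reachable and returns the usual JSON node list from /ws/v1/cluster/nodes; the function names, node ids, and health report text in the sample are made up for illustration.

```python
import json
import urllib.request


def unhealthy_nodes(nodes_response):
    """Return (node id, health report) for each node marked UNHEALTHY
    in a parsed /ws/v1/cluster/nodes response."""
    nodes = (nodes_response.get("nodes") or {}).get("node") or []
    return [
        (node.get("id", "?"), node.get("healthReport", ""))
        for node in nodes
        if node.get("state") == "UNHEALTHY"
    ]


def fetch_nodes(rm_url):
    """Fetch and parse the node list from the ResourceManager at `rm_url`."""
    request = urllib.request.Request(
        rm_url.rstrip("/") + "/ws/v1/cluster/nodes",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Made-up sample data standing in for a real ResourceManager response.
    sample = {
        "nodes": {
            "node": [
                {"id": "worker1:8041", "state": "RUNNING", "healthReport": ""},
                {
                    "id": "worker2:8041",
                    "state": "UNHEALTHY",
                    "healthReport": "1/1 local-dirs are bad: /data/nm-local-dir",
                },
            ]
        }
    }
    for node_id, report in unhealthy_nodes(sample):
        print(f"{node_id}: {report}")
```

    Against a live cluster, call unhealthy_nodes(fetch_nodes(...)) with your own ResourceManager web address. A report that mentions bad local-dirs or log-dirs points back to the disk usage check in step 2.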