I'm trying to execute an oozie workflow that was written by a colleague. I execute this command:
oozie job -config ./job.properties -run
I have set up parameters in job.properties, including my user.name, and I can see those values being used in the workflow when I examine the logs - creating files in my hdfs directory (e.g. exportDir=/user/${user.name}/ra_export). But at some point in the workflow, it fails with permission errors, because it attempts to modify something in my colleague's directory. It's acting as if ${user.name} was cached somewhere and is using an old value. Has anyone seen behavior like this, and if so, what's the solution?
Update:
Here's the failing portion of the log:
1215755 [main] INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator - Moving tmp dir: hdfs://hadoop-name-01.mycompany.com:8020/tmp/hive-staging_hive_2015-08-06_19-51-57_511_3052536268795125086-1/_tmp.-ext-10000 to: hdfs://hadoop-name-01.mycompany.com:8020/tmp/hive-staging_hive_2015-08-06_19-51-57_511_3052536268795125086-1/-ext-10000
1215761 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=task.MOVE.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
1215762 [main] INFO org.apache.hadoop.hive.ql.exec.Task - Loading data to table client_reporting.campaign_web_events_export from hdfs://hadoop-name-01.mycompany.com:8020/tmp/hive-staging_hive_2015-08-06_19-51-57_511_3052536268795125086-1/-ext-10000
1215821 [main] ERROR org.apache.hadoop.hive.ql.exec.Task - Failed with exception Permission denied: user=clark.bremer, access=WRITE, inode="/user/john.smith/ra_export":john.smith:john.smith:drwxr-xr-x
But I can see from the top of the same log that the job.properties variable substitutions are taking place successfully:
Starting the execution of prepare actions
Deletion of path hdfs://hadoop-name-01.mycompany.com:8020/user/clark.bremer/foo_export succeeded.
Creating directory at /user/clark.bremer/foo_export succeeded.
Completed the execution of prepare actions successfully
But as you can see in the failing portion of the log, it's using both the wrong username (john.smith instead of clark.bremer) and the wrong export directory (ra_export instead of foo_export). John used ra_export the last time he ran this workflow.
Here's a portion of my job.properties file:
user.name=clark.bremer
jobTracker=hadoop-name-01.mycompany.com:8032
nameNode=hdfs://hadoop-name-01.mycompany.com:8020
exportDir=/user/${user.name}/foo_export
And here are some snippets from the query that creates the table:
CREATE EXTERNAL TABLE IF NOT EXISTS client_reporting.campaign_web_events_export
....
stored as textfile location '${EXPORTDIR}/campaign_web_events';
insert overwrite table client_reporting.campaign_web_events_export
Where EXPORTDIR is in my user directory.
Have you checked which user created the Hive table you are trying to access? Can you drop the existing Hive table, create a new one with your user, re-run the same job, and check the status?
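For example, here is a minimal HiveQL sketch (assuming the table name from the question) that checks who owns the existing table and where its data lives, then removes the stale definition:

-- The Owner and Location fields in the output will show whether the
-- table still points at John's old /user/john.smith/ra_export path.
DESCRIBE FORMATTED client_reporting.campaign_web_events_export;

-- The table is EXTERNAL, so dropping it removes only the metadata,
-- not the files under the old location.
DROP TABLE IF EXISTS client_reporting.campaign_web_events_export;

Because the CREATE statement uses IF NOT EXISTS, a table left over from John's earlier run keeps its old LOCATION, and the INSERT OVERWRITE keeps writing there no matter what exportDir resolves to in your job.properties. Dropping the table lets your run recreate it with LOCATION '${EXPORTDIR}/campaign_web_events' resolved against your own directory.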