I am running a map-reduce job through Oozie. The command I use is as follows.
oozie job -verbose -oozie http://myoozieurl -config job.properties -run
How can I view the logs generated by the Hadoop job? Is there a way to see the generated logs, or to redirect them so that they print in the terminal window?
If I run the job using the (MapR) hadoop command, I can see the output of the log statements on the terminal.
I am new to Hadoop and Oozie, so this may be a newbie oversight.
This post explains how logs are managed during MapReduce jobs:
https://discuss.zendesk.com/hc/en-us/articles/201925118
Once the job has completed, the NodeManager keeps the log for each container for ${yarn.nodemanager.log.retain-seconds}, which is 10800 seconds (3 hours) by default, and deletes the logs once they have expired. But if ${yarn.log-aggregation-enable} is set to true, the NodeManager immediately concatenates all of its container logs into one file, uploads that file into HDFS under ${yarn.nodemanager.remote-app-log-dir}/${user.name}/logs/<application ID>, and deletes the logs from the local userlogs directory. Log aggregation is enabled by default in PHD, and it makes log collection convenient.
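To make the path layout above concrete, here is a minimal shell sketch that assembles the aggregated log directory from its three parts. The function name aggregated_log_dir is hypothetical, and the example values (/yarn/apps, gpadmin, and the application ID) are assumptions taken from the listing in this post; your cluster's yarn.nodemanager.remote-app-log-dir may differ.

```shell
#!/bin/sh
# Build the HDFS directory that holds a job's aggregated container logs.
# Layout as described in the quoted article:
#   ${yarn.nodemanager.remote-app-log-dir}/${user.name}/logs/<application ID>
# aggregated_log_dir is a hypothetical helper, not part of Hadoop.
aggregated_log_dir() {
    remote_log_dir=$1   # value of yarn.nodemanager.remote-app-log-dir
    user=$2             # user who submitted the job
    app_id=$3           # YARN application ID
    # Strip one trailing slash from the base directory before joining.
    printf '%s/%s/logs/%s\n' "${remote_log_dir%/}" "$user" "$app_id"
}

# Example values (assumed, taken from the listing in this post).
aggregated_log_dir /yarn/apps gpadmin application_1389385968629_0025

# On a cluster node, the directory could then be listed with:
#   hdfs dfs -ls "$(aggregated_log_dir /yarn/apps gpadmin application_1389385968629_0025)"
# or the aggregated logs printed to the terminal with:
#   yarn logs -applicationId application_1389385968629_0025
```

The two commented commands are left unexecuted because they need a running cluster. For a job launched through Oozie, the Hadoop job ID should appear in the output of oozie job -oozie http://myoozieurl -info <job-id>; how that maps to the application ID may depend on your Hadoop version.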
Example when log aggregation is enabled. We know there were 4 containers executed in this MapReduce job because "-m" specified 3 mappers and the fourth container is the application master. Each NodeManager got at least one container, so all of them uploaded a log file.
[gpadmin@hdm1 ~]$ hdfs dfs -ls /yarn/apps/gpadmin/logs/application_1389385968629_0025/
Found 3 items
-rw-r----- 3 gpadmin hadoop 4496 2014-02-01 16:54 /yarn/apps/gpadmin/logs/application_1389385968629_0025/hdw1.hadoop.local_30825
-rw-r----- 3 gpadmin hadoop 5378 2014-02-01 16:54 /yarn/apps/gpadmin/logs/application_1389385968629_0025/hdw2.hadoop.local_36429
-rw-r----- 3 gpadmin hadoop 1877950 2014-02-01 16:54 /yarn/apps/gpadmin