After starting a two-node cluster using the official guide (r3.3.4), any subsequent YARN application fails because /tmp/hadoop-yarn is owned by mapred, as that directory is created when the JobHistory Server starts. I've tried the following in mapred-site.xml to accommodate multiple users:
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/tmp/hadoop-${user.name}/staging</value>
</property>
But in this case a log4j issue appears (log4j:WARN No appenders could be found for logger), which makes no sense to me since the log4j.properties file in the etc/hadoop directory is untouched.
How would one configure this behaviour properly?
[hdfs] $ $HADOOP_HOME/sbin/start-dfs.sh
[hdfs] $ hdfs dfs -mkdir -p /user/joe
[hdfs] $ hdfs dfs -chown joe:joe /user/joe
[yarn] $ $HADOOP_HOME/sbin/start-yarn.sh
[mapred] $ mapred --daemon start historyserver
[joe] $ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 1 10
...
Number of Maps = 1
Samples per Map = 10
Wrote input for Map #0
Starting Job
2023-03-11 12:31:55,998 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at some.server/192.168.0.172:8032
org.apache.hadoop.security.AccessControlException: Permission denied: user=joe, access=EXECUTE, inode="/tmp/hadoop-yarn":mapred:supergroup:drwxrwx---
...
...
2023-03-11 12:42:15,054 INFO mapreduce.Job: Job job_1678534891465_0001 failed with state FAILED due to: Application application_1678534891465_0001 failed 2 times due to AM Container for appattempt_1678534891465_0001_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2023-03-11 12:42:28.473]Exception from container-launch.
Container id: container_1678534891465_0001_02_000001
Exit code: 1
[2023-03-11 12:42:28.477]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
...
The staging directory setting is irrelevant to this error. In fact, you should avoid having it under /tmp in a production environment.
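As a minimal sketch of a non-/tmp setup (assuming you want per-user staging under /user; Hadoop appends <username>/.staging below the configured root), you could set in mapred-site.xml:

<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/user</value>
</property>

With that, joe's jobs would stage under /user/joe/.staging, which already belongs to him after the hdfs dfs -chown shown above.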
The log4j.properties file in etc/hadoop is only used by the Hadoop daemons, not by MapReduce applications, which each need their own log4j.properties on their classpath.
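For illustration only (a bare-bones sketch, not the only way to wire application logging), a minimal log4j 1.x configuration that could be bundled into the job jar or shipped alongside it might look like:

log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.Target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

The log4j warning here is only a symptom, though; the real failure is the permission error, so address that first.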
You need to fix the inode permission issue for joe, e.g.

hadoop fs -chmod -R 777 /tmp/hadoop-yarn

or ensure joe is a member of the supergroup group, which does have access. Note that group membership is resolved on the NameNode server, not just on the host where you execute the yarn/hdfs commands.
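For the group-membership route, a rough sketch (assuming the default shell-based group mapping and the default dfs.permissions.superusergroup value of supergroup) would be, on the NameNode host:

sudo groupadd supergroup                       # only if the group does not exist yet
sudo usermod -aG supergroup joe                # add joe to the group that owns /tmp/hadoop-yarn
hdfs dfsadmin -refreshUserToGroupsMappings     # make the NameNode pick up the new membership

Keep in mind that supergroup is also the HDFS superuser group by default, so the chmod route is the less privileged option.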