Tags: hadoop, mapreduce, hadoop-yarn

HDFS staging dir permission issues with yarn mapred framework - /tmp/hadoop-yarn/staging


After starting a two-node cluster using the official guide (r3.3.4), any subsequent YARN application fails because /tmp/hadoop-yarn is owned by mapred; the directory is created when the JobHistory Server is started.
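For context, the ownership can be confirmed with a plain listing before changing anything (a quick check added here, not a step from the guide):

# -d lists the directory entry itself; it shows the mapred:supergroup / drwxrwx--- ownership reported in the error output further down
[hdfs] $ hdfs dfs -ls -d /tmp/hadoop-yarn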

I've tried the following in mapred-site.xml to accommodate multiple users:

<property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/tmp/hadoop-${user.name}/staging</value>
</property>

But in this case a log4j warning appears (log4j:WARN No appenders could be found for logger), which makes no sense to me since the log4j.properties file is untouched in the etc/hadoop directory.

How would one configure this behaviour properly?

Steps:

  1. [hdfs] $ $HADOOP_HOME/sbin/start-dfs.sh
  2. [hdfs] $ hdfs dfs -mkdir -p /user/joe
  3. [hdfs] $ hdfs dfs -chown joe:joe /user/joe
  4. [yarn] $ $HADOOP_HOME/sbin/start-yarn.sh
  5. [mapred] $ mapred --daemon start historyserver
  6. [joe] $ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 1 10
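(If everything is driven from a single admin shell rather than separate logins, the [user] prefixes above translate to something like the following; the use of sudo is an assumption about the setup, not part of the guide.)

# run each step as the intended service or application user
sudo -u hdfs $HADOOP_HOME/sbin/start-dfs.sh
sudo -u hdfs hdfs dfs -mkdir -p /user/joe
sudo -u hdfs hdfs dfs -chown joe:joe /user/joe
sudo -u yarn $HADOOP_HOME/sbin/start-yarn.sh
sudo -u mapred mapred --daemon start historyserver
sudo -u joe yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 1 10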

Errors

Without changing the staging dir:

...
Number of Maps  = 1
Samples per Map = 10
Wrote input for Map #0
Starting Job
2023-03-11 12:31:55,998 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at some.server/192.168.0.172:8032
org.apache.hadoop.security.AccessControlException: Permission denied: user=joe, access=EXECUTE, inode="/tmp/hadoop-yarn":mapred:supergroup:drwxrwx---
...

Staging dir set to `/tmp/hadoop-${user.name}/staging`:

...
2023-03-11 12:42:15,054 INFO mapreduce.Job: Job job_1678534891465_0001 failed with state FAILED due to: Application application_1678534891465_0001 failed 2 times due to AM Container for appattempt_1678534891465_0001_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2023-03-11 12:42:28.473]Exception from container-launch.
Container id: container_1678534891465_0001_02_000001
Exit code: 1

[2023-03-11 12:42:28.477]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
...

Solution

  • The staging directory is irrelevant here. In fact, you should avoid placing it under /tmp in a production environment.

    The log4j.properties file in etc/hadoop is only for the Hadoop daemons, not for MapReduce applications, which each need their own log4j.properties on their classpath.

    You need to fix the inode permission issue by:

    1. Creating an actual UNIX user joe on the NameNode server, not just on the host where you execute the yarn/hdfs commands
    2. Making the staging directory writable by all users, e.g. run hadoop fs -chmod -R 777 /tmp/hadoop-yarn, or ensure joe is in the supergroup group, which does have access (see the sketch after this list)
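    A minimal sketch of that fix, assuming root access on the NameNode host and the default supergroup group name (the exact user/group management commands depend on your distribution and are not from the official guide):

    # On the NameNode host: create the OS-level account so HDFS group mapping can resolve joe
    sudo useradd joe

    # Option A: make the shared staging root writable by everyone (run as the HDFS superuser)
    [hdfs] $ hdfs dfs -chmod -R 777 /tmp/hadoop-yarn

    # Option B: instead, put joe into the HDFS superuser group, which already has rwx on the directory
    # (group resolution normally happens on the NameNode, so do this there)
    sudo groupadd -f supergroup
    sudo usermod -aG supergroup joe

    After either change, re-running step 6 as joe should get past the AccessControlException.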