I am submitting a simple Pyspark wordcount job to a freshly built YARN cluster, via Airflow and the SparkSubmitOperator. The job hits YARN, I can see it in the ResourceManager UI, but fails with this error:
"Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default"
*User: root
Name: PySpark Wordcount
Application Type: SPARK
Application Tags:
YarnApplicationState: FAILED
Queue: root.default
FinalStatus Reported by AM: FAILED
Started: Fri Feb 21 08:01:25 +1100 2020
Elapsed: 0sec
Tracking URL: History
Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default*
The default.root queue certainly seems to be there:
*Application Queues
Legend:CapacityUsedUsed (over capacity)Max Capacity
.root 0.0% used
..Queue: default 0.0% used
'default' Queue Status
Queue State: RUNNING
Used Capacity: 0.0%
Configured Capacity: 100.0%
Configured Max Capacity: 100.0%
Absolute Used Capacity: 0.0%
Absolute Configured Capacity: 100.0%
Absolute Configured Max Capacity: 100.0%
Used Resources: <memory:0, vCores:0>
Num Schedulable Applications: 0
Num Non-Schedulable Applications: 0
Num Containers: 0
Max Applications: 10000
Max Applications Per User: 10000
Max Application Master Resources: <memory:3072, vCores:1>
Used Application Master Resources: <memory:0, vCores:0>
Max Application Master Resources Per User: <memory:3072, vCores:1>
Configured Minimum User Limit Percent: 100%
Configured User Limit Factor: 1.0
Accessible Node Labels: *
Preemption: disabled*
What am I missing here ? Thanks
Submit with queue name default
.
The root
in the Resource Manager is used only to group the queues in hierarchical form.