Tags: apache-spark, hadoop, pyspark, airflow, hadoop-yarn

Error "unknown queue: root.default" when spark-submitting to YARN


I am submitting a simple PySpark wordcount job to a freshly built YARN cluster, via Airflow and the SparkSubmitOperator. The job reaches YARN (I can see it in the ResourceManager UI) but fails with this error:

"Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default"

User: root
Name: PySpark Wordcount
Application Type: SPARK
Application Tags:
YarnApplicationState: FAILED
Queue: root.default
FinalStatus Reported by AM: FAILED
Started: Fri Feb 21 08:01:25 +1100 2020
Elapsed: 0sec
Tracking URL: History
Diagnostics: Application application_1582063076991_0002 submitted by user root to unknown queue: root.default

The root.default queue certainly seems to be there in the scheduler UI:

Application Queues
    root                0.0% used
        Queue: default  0.0% used

'default' Queue Status
Queue State: RUNNING
Used Capacity: 0.0%
Configured Capacity: 100.0%
Configured Max Capacity: 100.0%
Absolute Used Capacity: 0.0%
Absolute Configured Capacity: 100.0%
Absolute Configured Max Capacity: 100.0%
Used Resources: <memory:0, vCores:0>
Num Schedulable Applications: 0
Num Non-Schedulable Applications: 0
Num Containers: 0
Max Applications: 10000
Max Applications Per User: 10000
Max Application Master Resources: <memory:3072, vCores:1>
Used Application Master Resources: <memory:0, vCores:0>
Max Application Master Resources Per User: <memory:3072, vCores:1>
Configured Minimum User Limit Percent: 100%
Configured User Limit Factor: 1.0
Accessible Node Labels: *
Preemption: disabled

What am I missing here? Thanks


Solution

  • Submit with the queue name default, not root.default.

    The root queue in the ResourceManager UI is only the parent that groups the queues into a hierarchy; you submit to the leaf queue name, so the queue displayed as root.default is addressed simply as default. A sketch of how to set this follows below.
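
    For example, if the job goes through Airflow's SparkSubmitOperator, the queue can be set through the operator's conf dictionary using the standard Spark property spark.yarn.queue. A minimal sketch, assuming an Airflow 1.10-style import path, a spark_default connection pointing at the YARN cluster, and a hypothetical /opt/jobs/wordcount.py script:

    ```python
    from datetime import datetime

    from airflow import DAG
    from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

    # Illustrative DAG; dag_id, schedule and file paths are placeholders.
    with DAG(
        dag_id="pyspark_wordcount_example",
        start_date=datetime(2020, 2, 1),
        schedule_interval=None,
    ) as dag:
        submit_wordcount = SparkSubmitOperator(
            task_id="pyspark_wordcount",
            application="/opt/jobs/wordcount.py",  # path to the wordcount script (placeholder)
            conn_id="spark_default",               # Airflow Spark connection for the YARN cluster
            name="PySpark Wordcount",
            conf={"spark.yarn.queue": "default"},  # leaf queue name, not "root.default"
        )
    ```

    The equivalent on a plain spark-submit command line is the --queue default option. It is also worth checking whether the Spark connection used by the operator has a queue value in its extras; if it says root.default there, change it to default as well, since that value is handed to spark-submit.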