apache-spark hadoop-yarn

Is it possible to assign hosts for Spark tasks with a regex?


I'm running Spark on YARN, and the cluster has nodes named like nodeA-xx and nodeB-xx. Is there any configuration that makes tasks run only on hosts named nodeA-*?


Solution

  • If you're using the Capacity Scheduler, you need to enable the feature called Node Labels. YARN Node Labels won't let you specify a regex. Instead, you label the nodes, make that label accessible to a queue, and then submit the job to that queue.
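
The steps above can be sketched as a small helper that applies the regex on the client side instead: it picks the matching hosts from a list, prints the `yarn rmadmin` commands that create a label and attach it to those hosts, and builds a `spark-submit` command line that targets the label. This is only a sketch under assumptions. The host list, the label name `nodeA`, the queue name `nodeA_queue`, and the application `my_job.py` are all hypothetical. It also assumes node labels are already enabled in `yarn-site.xml` (`yarn.node-labels.enabled`) and that the queue has been given access to the label in the Capacity Scheduler configuration, which this script does not do.

```python
import re


def hosts_matching(hosts, pattern):
    """Return the hosts whose full name matches the regex."""
    rx = re.compile(pattern)
    return [h for h in hosts if rx.fullmatch(h)]


def label_commands(hosts, label):
    """Build the yarn rmadmin commands that create a label and attach it to hosts."""
    mapping = " ".join(f"{h}={label}" for h in hosts)
    return [
        f'yarn rmadmin -addToClusterNodeLabels "{label}"',
        f'yarn rmadmin -replaceLabelsOnNode "{mapping}"',
    ]


def spark_submit_args(queue, label, app):
    """Build a spark-submit argument list that targets a queue and a node label."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--queue", queue,
        # Ask YARN to place the application master and the executors
        # only on nodes that carry this label.
        "--conf", f"spark.yarn.am.nodeLabelExpression={label}",
        "--conf", f"spark.yarn.executor.nodeLabelExpression={label}",
        app,
    ]


if __name__ == "__main__":
    # Hypothetical host names; in practice, take them from your inventory.
    cluster = ["nodeA-01", "nodeA-02", "nodeB-01"]
    selected = hosts_matching(cluster, r"nodeA-.*")

    # Commands to run once as a YARN admin.
    for cmd in label_commands(selected, "nodeA"):
        print(cmd)

    # Command to run for each job (queue and app names are placeholders).
    print(" ".join(spark_submit_args("nodeA_queue", "nodeA", "my_job.py")))
```

The script only prints the commands rather than running them, so they can be reviewed before anything on the cluster changes. The regex is evaluated once, at labeling time, so nodes added later need to be labeled again.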