apache-spark hadoop hadoop-yarn apache-ranger

Apache Ranger YARN plugin is not working as expected?


I have a Kerberized Hadoop cluster. I installed the Ranger YARN plugin on the ResourceManager and data nodes, then configured install.properties as below:

Repository Name       :  yarndev

Description           :  yarn repository

Username              :  yarn

Password              :  {password for yarn user in the system}

YARN REST URL         :  http://rm1:8088;http://rm2:8088

I've created the YARN plugin in the Ranger Admin UI as yarndev.

Afterwards I ran enable.sh and restarted both the ResourceManager and NodeManager services.

I can see 2 plugins in the admin panel (both are the YARN plugin running on my ResourceManager nodes).

When I update a policy, I can see that both nodes download it.

The problem is that it doesn't affect anything: any user can still submit a Spark or Hive application to whichever queue they select.

I want to restrict which queues users can use.

I use capacity-scheduler.xml to configure my queue settings. When I enable the YARN plugin, I can see that two properties are appended to the end of yarn-site.xml: yarn.acl.enable, set to true, and yarn.authorization-provider.
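For reference, the two entries described above usually look roughly like the fragment below. This is a sketch, not a copy of the actual file: the authorizer class name is an assumption based on what the Ranger YARN plugin normally ships, so check it against the yarn-site.xml on the ResourceManager.

```xml
<!-- yarn-site.xml (sketch): the two properties added when the plugin is enabled -->
<property>
  <name>yarn.acl.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.authorization-provider</name>
  <!-- Assumption: usual Ranger authorizer class; verify against your installation -->
  <value>org.apache.ranger.authorization.yarn.authorizer.RangerYarnAuthorizer</value>
</property>
```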

The other thing is that on the data nodes there is a folder /etc/ranger/yarndev/policycache/, but it is empty. It's as if the data nodes never get the policy from Ranger Admin. I don't know why, because nothing is logged in /var/log/yarn/nodemanager.log.

So what am I missing?


Solution

  • I configured every queue, including root, in capacity-scheduler.xml:

    yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications set to a single space (" ")

    and

    yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue set to a space followed by yarn (" yarn"), which means that only members of the yarn group can submit applications to any queue.

    I also applied the same settings to root itself, without a sub-queue name, e.g. root.acl_administer_queue.

    That's it, it works like a charm.
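As a rough sketch of the settings above, the capacity-scheduler.xml entries could look like the fragment below. The queue name dev is hypothetical and only stands in for <queue-path>; repeat the pair for each real queue. The likely reason this helps: YARN queue ACLs use the form "users groups", the root queue defaults to * (everyone) when nothing is set, and a permission granted on a parent queue also applies to its children. As far as I understand, the Ranger plugin falls back to these native ACLs when no Ranger policy grants access, so an open root ACL lets everyone through.

```xml
<!-- capacity-scheduler.xml (sketch) -->

<!-- root: a single space means "no users, no groups" may submit directly -->
<property>
  <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
  <value> </value>
</property>
<!-- root: leading space = empty user list, then the group name "yarn" -->
<property>
  <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
  <value> yarn</value>
</property>

<!-- "dev" is a hypothetical sub-queue used only for illustration -->
<property>
  <name>yarn.scheduler.capacity.root.dev.acl_submit_applications</name>
  <value> </value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.dev.acl_administer_queue</name>
  <value> yarn</value>
</property>
```

After editing the file, the queue configuration has to be reloaded, for example with yarn rmadmin -refreshQueues or by restarting the ResourceManager. Access for other users can then be granted per queue through Ranger policies.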