I am setting up an EMR job and find that I must specify separate security groups for the Master node and for the Core/Task nodes. What is the point of having two? If I run in client mode, I will only use the Master security group anyway. And I believe that if I run the EMR job in cluster mode, it should only use the Core/Task security group. Is this not correct?
That is at least my understanding, since when I choose between client and cluster mode it tells me this:
Run your driver on a slave node (cluster mode) or on the master node as an external client (client mode).
As per Working With Amazon EMR-Managed Security Groups:
The Security Group on the Master node allows:
The Security Group on the Core/Task nodes allows:
Typically, the Security Group on the Master node is also opened up so that you can connect to it directly (e.g. to run command-line Hive).
Access to the Core/Task nodes goes exclusively through the Master node. Any submitted job goes to the Master node first, and from there to the Core/Task nodes.
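To make this concrete, here is a minimal sketch of how the two groups might be supplied when launching a cluster programmatically. It only builds the `Instances` structure that the EMR `RunJobFlow` API accepts; the security group IDs, subnet ID, instance types and counts, and the helper name `build_instances_config` are all placeholders I made up for illustration. Note that both groups are set no matter which Spark deploy mode the job later uses, since the cluster always contains both kinds of node.

```python
def build_instances_config(master_sg, core_task_sg, subnet_id):
    """Build the Instances argument for an EMR RunJobFlow request.

    All values passed in are assumed to be IDs of resources that
    already exist in your account (hypothetical here).
    """
    return {
        "InstanceGroups": [
            # One Master node: receives submitted jobs and coordinates the cluster.
            {"Name": "Master", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            # Core nodes: do the work handed out by the Master node.
            {"Name": "Core", "InstanceRole": "CORE",
             "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        "Ec2SubnetId": subnet_id,
        # Security group applied to the Master node.
        "EmrManagedMasterSecurityGroup": master_sg,
        # Security group applied to the Core/Task nodes
        # (the API still uses the older "Slave" name for this key).
        "EmrManagedSlaveSecurityGroup": core_task_sg,
        "KeepJobFlowAliveWhenNoSteps": False,
    }


# Placeholder IDs, not real resources.
instances = build_instances_config(
    master_sg="sg-0123456789abcdef0",
    core_task_sg="sg-0fedcba9876543210",
    subnet_id="subnet-0123456789abcdef0",
)
```

With boto3 installed and credentials configured, the resulting dictionary would be passed as the `Instances` argument of `boto3.client("emr").run_job_flow(...)`, alongside the other required arguments such as `Name` and `ReleaseLabel`.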