amazon-web-services, amazon-s3, elastic-map-reduce

Can an EMR cluster be launched into a private VPC subnet with no public IPs that accesses the internet through a NAT instance in a public subnet?


Is it possible to launch an EMR cluster into the private subnet of a scenario-2 VPC (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html), where a NAT instance sits in the public subnet and no instance in the private subnet has a public IP?

On the one hand, I see "Additionally, you cannot use Amazon EMR through a Network Address Translation (NAT) device, but you can still use a NAT for other traffic in more complex scenarios." at http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-vpc-subnet.html . There are also StackOverflow questions, such as "Create EMR Cluster with No Public IP Addresses", that indicate that private clusters are not supported. On the other hand, I now understand that S3 is available as a VPC endpoint, and I'm wondering whether this has changed the story.


Solution

  • At this time (as of Nov 8th, 2015), EMR only supports public subnets, meaning that the subnet's default route (0.0.0.0/0) must point to an IGW (the VPC's internet gateway).

    Update: As of December 22, 2015, starting with release 4.2.0, EMR can work in a private subnet. See the announcement at https://forums.aws.amazon.com/ann.jspa?annID=3447 and an example at https://blogs.aws.amazon.com/bigdata/post/Tx349CL210VDYQF/Securely-Access-Web-Interfaces-on-Amazon-EMR-Launched-in-a-Private-Subnet
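
To make the public/private distinction in this answer concrete, here is a minimal sketch that classifies a subnet by where its default route (0.0.0.0/0) points. It assumes the route table is a plain dict shaped like one entry of the EC2 DescribeRouteTables response (a "Routes" list with "DestinationCidrBlock", "GatewayId", "InstanceId" and "NatGatewayId" keys). The function name `classify_subnet`, the return labels, and all resource IDs are hypothetical and exist only for illustration; this is not how EMR itself validates a subnet.

```python
def classify_subnet(route_table):
    """Classify a subnet by the target of its default route.

    `route_table` is assumed to be a dict shaped like one entry of the
    EC2 DescribeRouteTables response. The labels returned here are
    illustrative, not AWS terminology.
    """
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue  # only the default route decides public vs. private
        if (route.get("GatewayId") or "").startswith("igw-"):
            return "public"  # default route goes to an internet gateway
        if route.get("InstanceId") or route.get("NatGatewayId"):
            return "private-nat"  # default route goes through a NAT
        return "private-other"
    return "isolated"  # no default route at all


# Hypothetical route tables resembling the two subnets of a scenario-2 VPC.
public_rt = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc"},
]}
private_rt = {"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "InstanceId": "i-0nat"},
]}

print(classify_subnet(public_rt))   # public
print(classify_subnet(private_rt))  # private-nat
```

Under the answer's timeline, only a subnet that comes back as "public" would have been usable for EMR before release 4.2.0; a "private-nat" subnet is the case the update describes as supported from 4.2.0 onward.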