amazon-web-services amazon-ec2 apache-spark amazon-emr apache-zeppelin

Amazon EMR Tunneling Zeppelin and Jupyter Notebook


I am running Spark EMR on Amazon EC2 and I am trying to tunnel Jupyter Notebook and Zeppelin, so I can access them locally.

I tried running the below command with no success:

ssh -i ~/user.pem -ND 8157 [email protected]

What exactly is tunnelling and how can I set it up so I can use Jupyter Notebook and Zeppelin on EMR?

Is there a way to set up a basic configuration to make this work?

Many thanks.


Solution

  • Application ports such as 8890 (Zeppelin on the master node) are not exposed outside the cluster, so you cannot reach the notebook directly from your laptop. SSH tunneling lets you reach those ports securely over an SSH connection. With -D 8157, ssh only opens a SOCKS proxy on local port 8157; nothing uses it until a client is pointed at it. That is the step you are missing from Set Up an SSH Tunnel to the Master Node Using Dynamic Port Forwarding. Specifically, "After the tunnel is active, configure a SOCKS proxy for your browser."
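
To make the steps concrete, here is a minimal dry-run sketch. It only prints the commands so you can review them before running anything. The key path, the MASTER_PUBLIC_DNS placeholder, and the hadoop login user are assumptions, not values from the question; port 8157 comes from the question and 8890 from the answer above. The last function shows an alternative, local port forwarding with -L, which is not part of the quoted AWS steps but avoids the browser proxy if you only need a single port.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of running them.
# Every value below is a placeholder (assumption); override via environment variables.
KEY_FILE="${KEY_FILE:-$HOME/user.pem}"         # path to your EC2 key pair file
MASTER_DNS="${MASTER_DNS:-MASTER_PUBLIC_DNS}"  # replace with the master node's public DNS name
SSH_USER="${SSH_USER:-hadoop}"                 # assumed login user on the master node
SOCKS_PORT="${SOCKS_PORT:-8157}"               # local SOCKS port, as in the question
APP_PORT="${APP_PORT:-8890}"                   # Zeppelin port named in the answer

# Step 1: open the dynamic (SOCKS) tunnel.
# -N runs no remote command; -D opens a SOCKS proxy on the given local port.
socks_tunnel_cmd() {
  echo "ssh -i $KEY_FILE -N -D $SOCKS_PORT $SSH_USER@$MASTER_DNS"
}

# Step 2: check the tunnel by sending one request through the SOCKS proxy.
# --socks5-hostname makes curl resolve the host name on the far side of the tunnel.
socks_check_cmd() {
  echo "curl --socks5-hostname localhost:$SOCKS_PORT http://$MASTER_DNS:$APP_PORT/"
}

# Alternative: forward one port only, so http://localhost:$APP_PORT works
# in the browser without any proxy configuration.
local_forward_cmd() {
  echo "ssh -i $KEY_FILE -N -L $APP_PORT:localhost:$APP_PORT $SSH_USER@$MASTER_DNS"
}

socks_tunnel_cmd
socks_check_cmd
local_forward_cmd
```

Run the first printed command in one terminal and leave it open; it shows no output while the tunnel is up. If the curl check returns HTML, the tunnel works and what remains is pointing your browser at the SOCKS proxy on localhost:8157.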