I have an EMR cluster with Spark/Hive/Zeppelin. In my Zeppelin notebook, I tried to import pandas:
import pandas as pd
But I got this error:
ImportError: No module named pandas
How can I resolve this issue? Is it because pandas is not installed on the EMR cluster?
It was a matter of installing pandas on the master node:
sudo pip install pandas
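
To confirm the install took effect, a quick check like the one below can be run in a Zeppelin paragraph. It is only a sketch: the helper name module_available is made up for illustration, and it assumes the paragraph runs under the same Python interpreter that pip installed into. If the printed interpreter path differs from the one pip uses, that mismatch is the likely cause. Note that this only checks the node where the interpreter runs; code executed on worker nodes may need pandas installed there as well.

```python
import importlib
import sys


def module_available(name):
    """Return True if `name` can be imported by this interpreter.

    Hypothetical helper for illustration; works on Python 2.7 and 3.
    """
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# Show which Python binary the notebook paragraph is actually using.
print("interpreter: %s" % sys.executable)
print("pandas available: %s" % module_available("pandas"))
```

If the last line reports False after the install, restarting the Zeppelin interpreter is worth trying before reinstalling.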