Tags: pandas, amazon-emr, apache-zeppelin

ImportError: No module named pandas in Zeppelin (EMR)


I have an EMR cluster with Spark/Hive/Zeppelin. In my Zeppelin notebook, I tried to import pandas:

import pandas as pd

But I got this error:

ImportError: No module named pandas

How can I resolve this issue? Is it because pandas is not installed on the EMR cluster?


Solution

  • pandas was not installed. The fix was to install it on the master node:

    sudo pip install pandas
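
To confirm the install took effect, a quick check can be run in a Zeppelin Python paragraph. This is only a minimal sketch: `module_available` is a hypothetical helper name, not part of Zeppelin or EMR. It prints which Python executable the interpreter is using, which may help if `pip` installed pandas for a different Python than the one Zeppelin runs.

```python
import sys


def module_available(name):
    """Return True if this interpreter can import the module `name`.

    Hypothetical helper for illustration only.
    """
    try:
        __import__(name)
        return True
    except ImportError:
        return False


# Show which Python the notebook is actually running, so it can be
# compared with the Python that `pip` installed into.
print("Interpreter: {}".format(sys.executable))
print("pandas importable: {}".format(module_available("pandas")))
```

If the second line still reports `False` after installing, the interpreter path printed on the first line is a reasonable place to start looking: `pip` may belong to another Python installation on the node.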