Tags: python, apache-pig, user-defined-functions

Python UDF in Pig


Whenever I import an external Python package in a Pig UDF, I get the following error:

Python Error. Traceback (most recent call last):
  File "pythonudf.py", line 5, in <module>
    from bs4 import BeautifulSoup
ImportError: No module named bs4

I've tried adding the library path to sys.path:

import sys
sys.path.append('/usr/local/lib/python3.5/dist-packages')

and setting JYTHONPATH:

export JYTHONPATH=$JYTHONPATH:/usr/local/lib/python3.5/dist-packages

But it still shows the same error. What else can I do? The script fails in both local and mapreduce mode.

PS: Other functions that do not import external packages run fine.

EDIT: The packages imported in the Python code are installed.


Solution

  • Use the -embedded option when running Pig with a Python UDF that imports external packages:

    pig -embedded jython pythonudf.py
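
If the import still fails, it may help to check which interpreter is actually loading the UDF and what is on its search path. The sketch below is only a diagnostic aid, not part of the original answer. It assumes the UDF runs under Jython (as the JYTHONPATH setting in the question suggests), in which case packages installed for the system Python 3.5 may not be usable. The helper names `describe_interpreter` and `can_import` are hypothetical, and the code sticks to constructs that should work on both Python 2 style interpreters and Python 3.

```python
import sys


def describe_interpreter():
    """Return the version string reported by the running interpreter."""
    return sys.version


def can_import(module_name):
    """Return True if this interpreter can import module_name, else False."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    # Show which interpreter is running and whether bs4 is visible to it.
    print(describe_interpreter())
    print("bs4 importable: %s" % can_import("bs4"))
    # List the directories searched for modules, one per line.
    for entry in sys.path:
        print(entry)
```

Running the same checks from inside the UDF file and from a normal shell makes it easier to see whether the two environments search the same directories.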