Tags: python, opencv, hadoop-yarn, conda

How does conda handle the .so files of cv2 when shipping the environment with --archives to a YARN cluster?


We are using cv2 (opencv-python) on the worker nodes in PySpark, so we build the environment with conda-pack and ship it via --archives to the YARN cluster, but we hit an error at runtime:

ImportError: libgthread-2.0.so.0: cannot open shared object file: No such file or directory
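For context, this is roughly how we package and submit the job (a minimal sketch following the conda-pack documentation's Spark-on-YARN pattern; the environment name, archive name, and script name are placeholders):

    # Pack the conda env into a relocatable archive (names are placeholders)
    conda pack -n my_env -o environment.zip

    # Ship the archive with the job; YARN unpacks environment.zip into a
    # directory aliased as "environment", and the workers use its Python
    PYSPARK_PYTHON=./environment/bin/python \
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --archives environment.zip#environment \
      --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./environment/bin/python \
      job.py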

However, when we inspect the environment.zip generated by conda-pack, there are many .so files inside it.
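You can confirm whether the missing library is actually bundled in the archive with a quick listing (the archive name matches the one above):

    # Look for the library named in the ImportError
    unzip -l environment.zip | grep 'libgthread'

    # For comparison, list a few of the .so files that *are* bundled
    unzip -l environment.zip | grep '\.so' | head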

Generally, if the .so file exists, the fix is to add its directory to the library search path, either through an environment variable (LD_LIBRARY_PATH) or a config file under /etc/ld.so.conf.d. But if the search path were the problem, many Python packages that depend on .so files would raise the same error. This is the first time we have seen this error, and it is triggered only by cv2.
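One way to see which dependencies the compiled cv2 module cannot resolve is to run ldd against it inside the unpacked archive (a sketch; the unpack directory and file glob are assumptions about the archive layout):

    # Unpack the archive locally (hypothetical directory name)
    mkdir -p env && unzip -q environment.zip -d env

    # Find the compiled cv2 extension and list its unresolved shared-library
    # dependencies; libgthread-2.0.so.0 showing up as "not found" would mean
    # cv2 expects it from the host OS rather than from the conda environment
    ldd "$(find env -name 'cv2*.so' | head -n 1)" | grep 'not found'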

The conda-pack archive is built on Ubuntu 14; the YARN nodes run Ubuntu 16.

What is the likely cause? And how does conda handle the conda-pack zip file when it is shipped to a YARN cluster?


Solution

  • We have found the answer: it is related to the opencv version.

    So far, we have found that opencv==3.4.2 works.
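    In case it helps anyone else, this is roughly how we pin the working version before repacking (a sketch; the env name is a placeholder, and the exact pip release string may differ, e.g. 3.4.2.17, hence the wildcard specifier):

        # Recreate the environment with the working opencv version, then repack
        conda create -n my_env -y python=3.6
        conda activate my_env
        pip install "opencv-python==3.4.2.*"
        conda pack -n my_env -o environment.zip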