
Error installing Horovod in Docker - PIP - C++17


I am trying to create a Docker image for our training machine. Installing Horovod for Python fails, apparently because the build does not compile with C++17.

My dockerfile so far:

FROM nvidia/cuda:12.2.0-devel-ubuntu22.04

ENV DEBIAN_FRONTEND noninteractive

RUN echo "root:123123" | chpasswd

RUN groupadd -g 1000 user1 && useradd user1 -u 1000 -g 1000 -m -s /bin/bash

RUN apt-get -y update
RUN apt-get -y upgrade

RUN apt-get install -y python3-pip python3-setuptools python3-opencv sudo vim wget cmake 

ENV PYTHON_BIN "python3.10"
ENV PYTHON_SITE_PACKAGES="/usr/local/lib/${PYTHON_BIN}/dist-packages"

RUN /usr/bin/update-alternatives --install /usr/bin/python python /usr/bin/${PYTHON_BIN} 10 && \
    /usr/bin/update-alternatives --install /usr/bin/python3 python3 /usr/bin/${PYTHON_BIN} 10

RUN ${PYTHON_BIN} -m pip install torch torchvision pyyaml pandas scikit-image openexr

I build the image with docker build --rm --no-cache --tag my_image --file "./Dockerfile" .

and start it with docker run --gpus all -it --rm --user $(id -u):$(id -g) --entrypoint /bin/bash my_image

After entering the container and switching to root, I tried to install Horovod with MMVC_CUDA_ARGS="-std=c++17" HOROVOD_BUILD_ARCH_FLAGS="-std=c++17" HOROVOD_WITHOUT_MXNET=1 HOROVOD_WITHOUT_TENSORFLOW=1 pip install horovod[pytorch]

According to the documentation (https://horovod.readthedocs.io/en/stable/install.html), a C++17 compiler is required. The installed g++ 11.4 supports C++17, so that requirement should be met:

# `which g++` --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

An excerpt from the error output shows that the file is still compiled with -std=c++14, so the C++17 check in ATen fails:

[ 97%] Building CXX object horovod/torch/CMakeFiles/pytorch.dir/cuda_util.cc.o
      cd /tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/build/temp.linux-x86_64-3.10/RelWithDebInfo/horovod/torch && /usr/bin/c++ -DEIGEN_MPL2_ONLY=1 -DHAVE_CUDA=1 -DHAVE_GLOO=1 -DHAVE_GPU=1 -DHAVE_NVTX=1 -DPYTORCH_VERSION=2003000000 -DTORCH_API_INCLUDE_EXTENSION_H=1 -Dpytorch_EXPORTS -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/HTTPRequest/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/assert/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/config/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/core/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/detail/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/iterator/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/lockfree/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/mpl/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/parameter/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/predef/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/preprocessor/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/static_assert/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/type_traits/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/boost/utility/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/lbfgs/include -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/gloo 
-I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/eigen -I/tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/third_party/flatbuffers/include -isystem /usr/local/cuda/include -isystem /usr/local/cuda/targets/x86_64-linux/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.10/dist-packages/torch/include/THC -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0  -pthread -fPIC -Wall -ftree-vectorize -mf16c -mavx -mfma -O3 -g -DNDEBUG -fPIC -std=c++14 -MD -MT horovod/torch/CMakeFiles/pytorch.dir/cuda_util.cc.o -MF CMakeFiles/pytorch.dir/cuda_util.cc.o.d -o CMakeFiles/pytorch.dir/cuda_util.cc.o -c /tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/horovod/torch/cuda_util.cc
      In file included from /tmp/pip-install-qqukqcka/horovod_0b6322654f564ffc82c68379b1882f61/horovod/torch/cuda_util.cc:22:
      /usr/local/lib/python3.10/dist-packages/torch/include/ATen/ATen.h:4:2: error: #error C++17 or later compatible compiler is required to use ATen.
          4 | #error C++17 or later compatible compiler is required to use ATen.

So I guess my attempts to request C++17 had no effect. How can I tell pip to compile with C++17 during installation?
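Since the installed compiler supports C++17, the thing to check in such a log is which -std= flag the build actually passes. A minimal sketch for pulling that flag out of compiler invocations follows; the function name extract_std_flags and the shortened sample line are made up for illustration and are not part of Horovod or pip.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct -std= flags found in
# compiler command lines read from standard input.
extract_std_flags() {
    # -o prints only the matching part; "--" stops option parsing so the
    # pattern starting with "-" is not mistaken for an option.
    grep -o -- '-std=[^[:space:]]*' | sort -u
}

# Shortened, made-up compile line standing in for the real log output.
echo '/usr/bin/c++ -O3 -fPIC -std=c++14 -c cuda_util.cc' | extract_std_flags
# prints: -std=c++14
```

With a build log saved to a file (for example by piping a verbose pip install through tee), running extract_std_flags < build.log would list every standard the build requested, which makes it easy to see whether the environment variables had any effect.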


Solution

  • After a long search I finally found the crucial hint in the Horovod issues on GitHub. For some reason I had missed it at the beginning of my search.

    HOROVOD_WITHOUT_MXNET=1 HOROVOD_WITHOUT_TENSORFLOW=1 pip install git+https://github.com/thomas-bouvier/horovod.git@compile-cpp17[pytorch]

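    The issue is fixed on the compile-cpp17 branch of the thomas-bouvier/horovod fork, which this command installs instead of the released package.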
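The install step can be wrapped in a small script so that it is repeatable and followed by a build check. This is only a sketch: the function name install_horovod_cpp17 and the DRY_RUN switch are hypothetical, and it assumes pip and horovodrun are on the PATH inside the container. The fork URL and environment variables are the ones from the command in this answer.

```shell
#!/bin/sh
# Fork and branch taken from the answer; the [pytorch] extra is kept as given there.
HOROVOD_SPEC='git+https://github.com/thomas-bouvier/horovod.git@compile-cpp17[pytorch]'

# Hypothetical helper: install Horovod from the fork, then print the build summary.
# With DRY_RUN=1 it only prints the install command instead of running it.
install_horovod_cpp17() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "HOROVOD_WITHOUT_MXNET=1 HOROVOD_WITHOUT_TENSORFLOW=1 pip install $HOROVOD_SPEC"
        return 0
    fi
    HOROVOD_WITHOUT_MXNET=1 HOROVOD_WITHOUT_TENSORFLOW=1 \
        pip install "$HOROVOD_SPEC" &&
        horovodrun --check-build   # lists the frameworks and controllers that were built
}

# Usage: run `install_horovod_cpp17` as root inside the container,
# or set DRY_RUN=1 first to inspect the command.
```

If the build succeeded, the output of horovodrun --check-build should mark PyTorch as available.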