Tags: apache-spark, pyspark, virtualenv, conda, redhat

Conda environment for PySpark: Red Hat cluster, but Mac computer


I need to build a Python environment, package it with conda-pack, and pass it to PySpark through the archives configuration option (https://conda.github.io/conda-pack/spark.html). The cluster I want to run PySpark on has no Internet access, so I have to prepare the conda environment on my local computer (macOS) and scp it to the cluster, which runs Red Hat.

How can I prepare the conda environment when the operating systems are different?


Solution

  • I built the Python environment in a virtual machine (VirtualBox) running the free developer edition of Red Hat Enterprise Linux, so the packaged binaries target Linux rather than macOS. An environment packaged with conda-pack on RHEL 8.2 was compatible with the RHEL 7.7 cluster.
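
For anyone following the same route, here is a minimal sketch of how the packed archive might then be wired into PySpark. It is not the exact setup used above: the environment name `myenv`, the archive name `environment.tar.gz`, the alias `environment`, and the helper functions are all hypothetical, and it assumes Spark 3.1 or later, where the `spark.archives` setting is available (older YARN deployments use `spark.yarn.dist.archives` instead).

```python
import os

# Assumed to have been run beforehand inside the RHEL virtual machine
# (environment name and package list are placeholders):
#   conda create -n myenv python pandas
#   conda pack -n myenv -o environment.tar.gz
# The archive is then copied to the cluster, e.g. with scp.

ARCHIVE = "environment.tar.gz"  # hypothetical archive name
ALIAS = "environment"           # directory name the archive is unpacked under


def archive_settings(archive, alias):
    """Return the Spark conf entry and the Python path for a packed env."""
    conf = {"spark.archives": f"{archive}#{alias}"}
    python_path = f"./{alias}/bin/python"
    return conf, python_path


def build_session(archive=ARCHIVE, alias=ALIAS):
    """Create a SparkSession whose executors use the packed environment."""
    # Imported lazily so the helper above works without pyspark installed.
    from pyspark.sql import SparkSession

    conf, python_path = archive_settings(archive, alias)
    # Executors resolve this path relative to the unpacked archive.
    os.environ["PYSPARK_PYTHON"] = python_path

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `build_session()` on the cluster's edge node should then start a session in which the executors run the Python interpreter shipped inside the archive, so no package installation is needed on the offline nodes.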