I have been stuck on this problem for several hours. I am trying to install psycopg2-binary via my Dockerfile:
FROM jupyter/pyspark-notebook:latest
RUN pip install psycopg2-binary
COPY Spark.py /app/Spark.py
COPY postgres-jdbc-driver.jar /app/postgres-jdbc-driver.jar
ENV SPARK_CLASSPATH=/app/postgres-jdbc-driver.jar
CMD ["python", "/app/Spark.py"]
But it does not get installed. When I run "pip list" inside my Docker container, I don't see the package there. Any idea why? When I install it manually in the container with "pip install psycopg2-binary", it works.
This is my docker-compose file:
jupyter-pyspark-notebook:
  build:
    context: .
    dockerfile: ./DockerfileSpark
  hostname: jupyter-pyspark-notebook
  container_name: jupyter-pyspark-notebook
  ports:
    - "8888:8888"
  restart: always
  volumes:
    - ./Spark.py:/app/Spark.py
  command: sh -c "sleep 40 && python /app/Spark.py"
This is the error I get, which makes sense because psycopg2-binary is not installed in the container. But why is it not being installed? I have no idea ...
File "/app/Spark.py", line 3, in <module>
    import psycopg2
ModuleNotFoundError: No module named 'psycopg2'
It is probably a caching problem: the image was built from a cached layer that predates the pip install step. Please rebuild without the cache:

docker-compose build --no-cache
docker-compose up
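After rebuilding, you can sanity-check from inside the container (e.g. via docker exec into jupyter-pyspark-notebook) whether the package is actually importable. A minimal sketch; has_module is just a helper name I made up for this check:

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be imported by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Inside the correctly rebuilt container this should print True;
    # in a stale cached image it will print False.
    print(has_module("psycopg2"))
```

This avoids a hard crash while debugging: unlike a bare `import psycopg2`, `find_spec` reports a missing package without raising ModuleNotFoundError.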