Tags: docker, apache-spark, jupyter-notebook, jupyter, jupyter-server

Docker - all-spark-notebook Communications link failure


I'm new to Docker and Spark.

My docker-compose.yml file is:

volumes:
  shared-workspace:
services:
  notebook:
    image: docker.io/jupyter/all-spark-notebook:latest
    build:
      context: .
      dockerfile: Dockerfile-jupyter-jars
    ports:
      - 8888:8888
    volumes:
      - shared-workspace:/opt/workspace

And the Dockerfile-jupyter-jars is:

FROM docker.io/jupyter/all-spark-notebook:latest
USER root
RUN wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar
RUN mv mysql-connector-java-8.0.28.jar /usr/local/spark/jars/
USER jovyan

To start it up, I run:

docker-compose up --build

The server is up and running and I want to use spark-sql, but it throws an error when trying to connect to the MySQL server: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure

I can see mysql-connector-java-8.0.28.jar in the "jars" folder, and the same SQL statement works in a non-Docker Apache Spark installation.

The MySQL server is also reachable from the same host where I'm running Docker.

Do I need to enable something to allow connections from the container to external servers? Any ideas?

Reference: https://hub.docker.com/r/jupyter/all-spark-notebook


Solution

  • The docker-compose.yml and Dockerfile-jupyter-jars files were correct. The problem was the connection string: with mysql-connector-java-8.0.28.jar, the connection either has to use SSL or SSL has to be disabled explicitly in the JDBC URL:

    jdbc:mysql://user:[email protected]:3306/inventory?useSSL=FALSE&nullCatalogMeansCurrent=true
    

    I'm going to leave this example here for reference: Docker - all-spark-notebook with MySQL dataset
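
For anyone who wants to try the fix from a notebook cell, here is a minimal sketch in Python of how the URL above could be assembled and passed to Spark's JDBC reader. The helper names `build_mysql_jdbc_url` and `read_table` and the host `mysql.example.com` are made up for illustration and do not come from the original setup. The table name, user and password must be supplied by the caller. It assumes the connector jar is already in /usr/local/spark/jars/ as in the Dockerfile above.

```python
from urllib.parse import urlencode


def build_mysql_jdbc_url(host, port, database, **params):
    """Build a MySQL JDBC URL, appending connection properties as a query string."""
    base = f"jdbc:mysql://{host}:{port}/{database}"
    query = urlencode(params)  # keeps the keyword-argument order
    return f"{base}?{query}" if query else base


def read_table(spark, url, table, user, password):
    """Load one table through Spark's built-in JDBC data source.

    `spark` is an existing SparkSession; credentials are passed as options
    instead of being embedded in the URL.
    """
    return (
        spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .option("driver", "com.mysql.cj.jdbc.Driver")  # Connector/J 8.x driver class
        .load()
    )


# Hypothetical host name; replace with the address of the real MySQL server.
url = build_mysql_jdbc_url(
    "mysql.example.com", 3306, "inventory",
    useSSL="false", nullCatalogMeansCurrent="true",
)
print(url)
```

In a notebook, something like `read_table(spark, url, "some_table", "user", "password").show()` would then run the query, assuming a SparkSession named `spark` exists. Disabling SSL is what resolved the error here; if the server supports TLS, configuring it properly is probably the safer choice outside a test environment.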