Tags: docker, apache-spark, kubernetes, minikube, k8s-rolebinding

Error from server (BadRequest): container "spark-kubernetes-driver" in pod "test-run-spark" is waiting to start: trying and failing to pull image


Minikube on macOS is not able to pull Docker images from the Docker repository.

I'm trying to run Spark on k8s:

spark-submit \
  --master k8s://https://ip:port \
  --deploy-mode cluster \
  --name test-run-spark \
  --conf spark.kubernetes.container.image=Docker-image \
  --conf spark.kubernetes.driver.container.image=docker4tg/Docker-image \
  --conf spark.kubernetes.executor.container.image=docker4tg/Docker-image \
  --conf spark.kubernetes.driver.pod.name=test-run-spark \
  --class [class] \
  --num-executors 1 \
  --executor-memory 512m \
  --driver-memory 512m \
  --driver-cores 2 \
  --executor-cores 2 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///[jar].jar

I can pull this same Docker image to my local machine, but the k8s pods cannot pull it. Here's the catch: for a few tags of the same image, moving the mkdir command up or down in the Dockerfile to change the hash made the pull work. I made no logical changes, yet it worked fine for 3 to 4 tags and the application ran successfully. I could not understand this.

Please help me to figure out the issue.

Dockerfile


FROM ubuntu:18.04
ARG SPARKVERSION=tmpsVersion
ARG HADOOPVERSION=tmpHVersion
ENV SPARK_VERSION=$SPARKVERSION
ENV HADOOP_VERSION=$HADOOPVERSION
RUN sed -i s/http/ftp/ /etc/apt/sources.list && apt-get update -y
RUN apt-get install wget -y
RUN apt-get install openjdk-8-jdk -y
RUN sed -i s/http/ftp/ /etc/apt/sources.list && apt-get update -y
RUN mkdir -p /opt/spark/work-dir
WORKDIR /opt/spark/work-dir
RUN wget -O spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz  https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
RUN tar -xzvf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz -C /opt/spark/
RUN rm -rf spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz
RUN mv -f /opt/spark/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}/* /opt/spark/
RUN rm -rf /opt/spark/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}
ENV SPARK_HOME=/opt/spark
ENV PATH="${SPARK_HOME}/bin:${PATH}"
RUN mkdir -p /opt/spark/data-jars/
COPY [jar.jar] /opt/spark/data-jars/
ENTRYPOINT [ "/opt/spark/kubernetes/dockerfiles/spark/entrypoint.sh" ]

ERROR:

kubectl get pods; kubectl logs test-run-spark
NAME             READY   STATUS             RESTARTS   AGE
test-run-spark   0/1     ImagePullBackOff   0          2m36s
Error from server (BadRequest): container "spark-kubernetes-driver" in pod "test-run-spark" is waiting to start: trying and failing to pull image

Kindly help me with this, guys.


Solution

  • Your minikube environment is isolated from your host, so having the image on your host, or being able to pull it there, does not mean that minikube can access it.

    If you want to build the image in the minikube context:

    # point the docker CLI at minikube's Docker daemon
    eval $(minikube docker-env)
    # build your image directly in minikube (IMAGE_NAME is a placeholder)
    docker build -t IMAGE_NAME .
    

    If you have the image locally, you can load it into minikube with:

    minikube image load IMAGE_NAME
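
    To confirm that the image actually reached minikube, one option is to list the images minikube holds. The helper below is a minimal sketch: the function name image_in_minikube and the image reference in the usage comment are hypothetical, and it assumes a minikube version that provides the "minikube image ls" subcommand.

    ```shell
    # Hypothetical helper: succeeds if an image whose name contains $1
    # appears in the output of "minikube image ls".
    image_in_minikube() {
        minikube image ls | grep -qF -- "$1"
    }

    # Usage (the image reference is a placeholder):
    #   image_in_minikube docker4tg/IMAGE_NAME:TAG && echo "image is available in minikube"
    ```

    If the pod still reports ImagePullBackOff after that, kubectl describe pod test-run-spark lists the pull events along with the specific reason for the failure.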
    

    If you want minikube to pull the images from a private remote registry (e.g. Docker Hub), you can follow these instructions to add the registry credentials to your minikube.
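
    For a private registry, one possible approach (not necessarily the one the linked instructions describe) is a Kubernetes image pull secret that Spark passes to the driver and executor pods. The sketch below assumes Docker Hub as the registry; the function name create_pull_secret, the secret name regcred, and the DOCKER_USER and DOCKER_PASSWORD variables are all hypothetical placeholders.

    ```shell
    # Hypothetical helper: create a docker-registry secret named $1.
    # Assumes DOCKER_USER and DOCKER_PASSWORD are set in the environment
    # and that the registry is Docker Hub.
    create_pull_secret() {
        : "${DOCKER_USER:?set DOCKER_USER first}"
        : "${DOCKER_PASSWORD:?set DOCKER_PASSWORD first}"
        kubectl create secret docker-registry "$1" \
            --docker-server=https://index.docker.io/v1/ \
            --docker-username="$DOCKER_USER" \
            --docker-password="$DOCKER_PASSWORD"
    }

    # Usage (regcred is a placeholder secret name):
    #   create_pull_secret regcred
    #   spark-submit ... --conf spark.kubernetes.container.image.pullSecrets=regcred ...
    ```

    The secret has to exist in the same namespace as the Spark pods for the pull to use it.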