apache-spark, kubernetes, apache-zeppelin

Set a new name for the Spark executor pod in Kubernetes


I am using Apache Zeppelin and Apache Spark on Kubernetes. After the Spark interpreter pod is created, which acts as the Spark driver, it attempts to launch 2 executors. However, in the pod logs I'm encountering an error:

Exception when notifying snapshot subscriber.

java.lang.StringIndexOutOfBoundsException: String index out of range: 63
...

From my understanding, this error is related to the pod name being too long. Therefore, I need to configure a new name for the Spark executor pods in Kubernetes. I would appreciate your assistance in resolving this issue!

I expect that after changing the names of the pods, the error will be resolved, but an alternative solution would work as well.


Solution

  • By default, the Spark executor pod name is built from the application name, a Kubernetes-generated id that ensures uniqueness in the cluster, and a -exec-$id suffix, where $id is the executor identifier assigned by Spark.

    Basically, the app name plus the Kubernetes-generated id must fit in 47 characters, because the -exec-[0-9]{1,10} suffix is always appended and can take up to 16 characters, and the total accepted length is 63.

    So, if you have an app named foo, the executor pod names will look like:

    foo-de0d85892012de3b-exec-1

    foo-de0d85892012de3b-exec-2 ...

    Since this name must not exceed 63 characters, you can either:

    • rename the app (spark.app.name) so that the app name plus the Kubernetes-generated id does not exceed 47 characters
    • use the option spark.kubernetes.executor.podNamePrefix to set the prefix yourself instead of leaving it to Kubernetes; the executor pod names will then be podNamePrefix-exec-[0-9]{1,10}

    Note that the second option can be tricky: you need to ensure that only one such job runs in the cluster at a time, or implement the uniqueness logic yourself.

    For more details, see https://spark.apache.org/docs/latest/running-on-kubernetes.html#configuration
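As a quick sanity check before picking a name, the 47-character budget above can be verified with a small script. This is a sketch, not part of Spark itself; `check_prefix` and the sample names are hypothetical, and the 47 comes from 63 minus the up-to-16-character `-exec-<id>` suffix:

```shell
#!/usr/bin/env bash
# Check whether a candidate "<app-name>-<generated-id>" prefix (or a
# value for spark.kubernetes.executor.podNamePrefix) leaves room for
# the "-exec-<id>" suffix within Kubernetes' 63-character name limit.
check_prefix() {
  local prefix="$1"
  local max_prefix=47   # 63 - 16 ("-exec-" plus up to 10 digits)
  if [ "${#prefix}" -le "$max_prefix" ]; then
    echo "OK: '$prefix' (${#prefix} chars) fits"
  else
    echo "TOO LONG: '$prefix' (${#prefix} chars) exceeds $max_prefix"
  fi
}

check_prefix "foo-de0d85892012de3b"
check_prefix "a-very-long-zeppelin-interpreter-application-name-de0d85892012de3b"
```

A prefix that passes this check can then be supplied via `--conf spark.app.name=...` or `--conf spark.kubernetes.executor.podNamePrefix=...` on spark-submit, or in the corresponding Zeppelin interpreter settings.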