Search code examples
javascalamavenapache-sparkguava

Google secret manager API and google storage API not working with Apache Spark


Hello fellow StakOverflowers, I am currently working on an Apache Spark-based application written in Scala. I am using Maven as the build tool. The Application was working fine until I had a use case where I need to pull some secrets from the Google Secrets manager. I used the secret manager API for this purpose and used the maven dependency to import the JAR.

The code seems to work fine when I am running it on my local(Intellij) but as soon as I deploy it on my Spark cluster it starts to fail with the following error :

Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/util/concurrent/internal/InternalFutureFailureAccess
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(Unknown Source)
    at java.base/java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(Unknown Source)
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(Unknown Source)
    at java.base/java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/java.net.URLClassLoader.findClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.defineClass1(Native Method)
    at java.base/java.lang.ClassLoader.defineClass(Unknown Source)
    at java.base/java.security.SecureClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader.defineClass(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.net.URLClassLoader$1.run(Unknown Source)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/java.net.URLClassLoader.findClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
    at com.google.api.gax.retrying.ScheduledRetryingExecutor.createFuture(ScheduledRetryingExecutor.java:102)
    at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:61)
    at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:41)
    at com.google.api.gax.tracing.TracedUnaryCallable.futureCall(TracedUnaryCallable.java:75)
    at com.google.api.gax.rpc.UnaryCallable$1.futureCall(UnaryCallable.java:126)
    at com.google.api.gax.rpc.UnaryCallable.futureCall(UnaryCallable.java:87)
    at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112)
    at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.accessSecretVersion(SecretManagerServiceClient.java:1185)
    at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.accessSecretVersion(SecretManagerServiceClient.java:1125)
    at cl.falabella.symphony.commons.IngestionOracleSparkApplication$.main(IngestionOracleSparkApplication.scala:104)
    at cl.falabella.symphony.commons.IngestionOracleSparkApplication.main(IngestionOracleSparkApplication.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.common.util.concurrent.internal.InternalFutureFailureAccess
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
    at java.base/java.lang.ClassLoader.loadClass(Unknown Source)

I understand that this is happening because of a dependency conflict with Guava, so I removed the old guava version from $SPARK_HOME/jars and changed it to guava-30.1.1-JRE, didn't work.

I am attaching my pom.xml here :

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.example.ingestion</groupId>
    <artifactId>fwk-ingestion-spark-extraction-core</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <encoding>UTF-8</encoding>
        <scala.version>2.12.10</scala.version>
        <scala-maven-plugin.version>4.0.0</scala-maven-plugin.version>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.7</version>
                <configuration>
                    <skipTests>true</skipTests>
                </configuration>
            </plugin>
            <plugin>
                <!-- see http://davidb.github.com/scala-maven-plugin -->
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>4.3.0</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <!-- <arg>-make:transitive</arg> -->
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.3.0</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass>com.example.symphony.commons.EncryptApplication</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.scalatest</groupId>
                <artifactId>scalatest-maven-plugin</artifactId>
                <version>1.0</version>
                <configuration>
                    <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
                    <junitxml>.</junitxml>
                    <filereports>WDF TestSuite.txt</filereports>
                </configuration>
                <executions>
                    <execution>
                        <id>test</id>
                        <goals>
                            <goal>test</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.jacoco</groupId>
                <artifactId>jacoco-maven-plugin</artifactId>
                <version>0.8.5</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>prepare-agent</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>report</id>
                        <phase>test</phase>
                        <goals>
                            <goal>report</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <transformers>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"/>
                            </transformers>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <shadedArtifactAttached>true</shadedArtifactAttached>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

        </plugins>
    </build>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>com.google.cloud</groupId>
                <artifactId>libraries-bom</artifactId>
                <version>20.8.0</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>

        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scalap</artifactId>
            <version>2.12.10</version>
        </dependency>
        <dependency>
            <groupId>com.typesafe.scala-logging</groupId>
            <artifactId>scala-logging_2.12</artifactId>
            <version>3.9.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-avro_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud.bigdataoss</groupId>
            <artifactId>gcs-connector</artifactId>
            <version>hadoop2-1.9.17</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud.bigdataoss</groupId>
            <artifactId>bigquery-connector</artifactId>
            <version>hadoop2-0.13.9</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.security</groupId>
            <artifactId>spring-security-core</artifactId>
            <version>5.2.1.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang.modules</groupId>
            <artifactId>scala-xml_2.12</artifactId>
            <version>1.3.0</version>
        </dependency>
        <dependency>
            <groupId>org.scalactic</groupId>
            <artifactId>scalactic_2.12</artifactId>
            <version>3.2.5</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.scalamock</groupId>
            <artifactId>scalamock-scalatest-support_2.12</artifactId>
            <version>3.6.0</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.scalatest</groupId>
            <artifactId>scalatest_2.12</artifactId>
            <version>3.2.2</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.13.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.12</artifactId>
            <version>3.1.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api-scala_2.12</artifactId>
            <version>11.0</version>
        </dependency>
        <dependency>
            <groupId>com.typesafe</groupId>
            <artifactId>config</artifactId>
            <version>1.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.scalacheck</groupId>
            <artifactId>scalacheck_2.12</artifactId>
            <version>1.15.3</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>net.liftweb</groupId>
            <artifactId>lift-json_2.12</artifactId>
            <version>3.4.3</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-storage</artifactId>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.23</version>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-secretmanager</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.13</version>
        </dependency>
        <dependency>
            <groupId>com.google.code.gson</groupId>
            <artifactId>gson</artifactId>
            <version>2.8.7</version>
        </dependency>



    </dependencies>
</project>

Any help or lead will be highly appreciated. Also, I am using the shaded jar.


Solution

  • You need to add an exclusion to google-cloud-secretmanager:

    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-secretmanager</artifactId>
      <exclusions>
        <exclusion>
          <groupId>com.google.api</groupId>
          <artifactId>gax</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    

    and add the following dependency:

    <dependency>
      <groupId>com.google.api</groupId>
      <artifactId>gax</artifactId>
      <version>1.66.0</version>
      <classifier>guavashaded</classifier>
    </dependency>
    

    FWIW I submitted a PR to upgrade guava in Apache Spark sometime ago, but apparently the guava situation in it is complicated by its own dependency issues.