Log4j vulnerability while using Scala and Spark with sbt


I am working on a Scala Spark project with the following dependencies:

 libraryDependencies ++=
    Seq(
      "org.apache.spark" %% "spark-core" % "2.2.0" ,
      "org.apache.spark" %% "spark-sql" % "2.2.0"  ,
      "org.apache.spark" %% "spark-hive" % "2.2.0"
    ),

with scalaVersion set to:

ThisBuild / scalaVersion := "2.11.8"

and I am getting the error below:

[error] sbt.librarymanagement.ResolveException: unresolved dependency: org.apache.logging.log4j#log4j-api;2.11.1: Resolution failed several times for dependency: org.apache.logging.log4j#log4j-api;2.11.1 {compile=[compile(*), master(*)], runtime=[runtime(*)]}::
[error]     typesafe-ivy-releases: unable to get resource for org.apache.logging.log4j#log4j-api;2.11.1: res=https://repo.typesafe.com/typesafe/ivy-releases/org.apache.logging.log4j/log4j-api/2.11.1/ivys/ivy.xml: java.io.IOException: Unexpected response code for CONNECT: 403
[error]     sbt-plugin-releases: unable to get resource for org.apache.logging.log4j#log4j-api;2.11.1: res=https://repo.scala-sbt.org/scalasbt/sbt-plugin-releases/org.apache.logging.log4j/log4j-api/2.11.1/ivys/ivy.xml: java.io.IOException: Unexpected response code for CONNECT: 403

Our security team asked us to delete the vulnerable log4j-core jar. Since then, projects that pull it in as a transitive dependency have been failing.

Is there a way to upgrade only the log4j version, without upgrading the Scala or Spark versions? I need a way to force the build not to fetch the older, vulnerable log4j-core jar and to use version 2.17.2 instead, which is not vulnerable.

I have tried:

  dependencyOverrides += "org.apache.logging.log4j" % "log4j-core" % "2.17.2"

I have also tried the excludeAll option in sbt on the Spark dependencies, but neither solution worked for me.


Solution

  • I made a few updates:

    Added the settings below to my sbt project (shown as a screenshot in the original post).

    Updated the following settings to a newer version, in build.properties and assembly.sbt respectively:

    sbt.version=1.6.2
    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.1.0")
    

    Added the log4j dependencies at the top so that any transitive dependency now resolves to the newer version.

    Below is a sample snippet from one of my projects:

    name := "project name"
    
    version := "0.1"
    
    scalaVersion := "2.11.8"
    
    assembly / assemblyJarName := s"${name.value}-${version.value}.jar"
    
    assembly / assemblyShadeRules := Seq(
      ShadeRule.rename("com.google.**" -> "shaded.@1").inAll
    )
    
    lazy val root = (project in file(".")).settings(
      assembly / test := {}
    )
    
    libraryDependencies += "org.apache.logging.log4j" % "log4j-core" % "2.17.2"
    libraryDependencies += "org.apache.logging.log4j" % "log4j-api" % "2.17.2"
    libraryDependencies += "org.apache.logging.log4j" % "log4j-slf4j-impl" % "2.17.2"
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided"
    libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.2.0" % "provided"
    libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.0" % "test"
    libraryDependencies += "com.typesafe" % "config" % "1.3.1"
    libraryDependencies += "org.scalaj" %% "scalaj-http" % "2.4.0"
    
    

    The following is needed only if there are conflicts between dependencies:

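    assembly / assemblyMergeStrategy := {
        // Drop all META-INF entries (manifests, signatures, service files)
        case PathList("META-INF", xs @ _*) => MergeStrategy.discard
        // Keep the first copy of any duplicated slf4j class
        case PathList("org", "slf4j", xs @ _*) => MergeStrategy.first
        // For every other duplicated path, keep the first copy found
        case x => MergeStrategy.first
    }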
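
For the override approach mentioned in the question, a minimal sketch follows. It assumes the older Log4j 2 jars really do arrive through the project's own transitive dependencies, and it pins all three artifacts together, because overriding log4j-core alone leaves log4j-api at whatever version was requested (the error above is about log4j-api). The name log4jVersion is only illustrative. Note that Spark 2.2.0 itself logs through Log4j 1.x, so the log4j-api 2.11.1 request may well have come from sbt's own classpath rather than from Spark, which would explain why raising sbt.version helped; that is an inference, not something the post confirms.

```scala
// build.sbt (sketch, not the author's exact configuration)

// Hypothetical helper value; 2.17.2 is the version named in the question.
val log4jVersion = "2.17.2"

// dependencyOverrides changes the version chosen for a module when it
// appears in the dependency graph, without adding it as a direct dependency.
dependencyOverrides ++= Seq(
  "org.apache.logging.log4j" % "log4j-api"        % log4jVersion,
  "org.apache.logging.log4j" % "log4j-core"       % log4jVersion,
  "org.apache.logging.log4j" % "log4j-slf4j-impl" % log4jVersion
)
```

Running sbt evicted afterwards should show which versions were replaced during resolution.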