Tags: apache-spark, jar, hadoop-yarn, amazon-emr, livy

Getting an import error while executing statements via Livy sessions on EMR


I am trying to post statements to a Livy session on EMR 6.1.0, but I am unable to import the class (from my custom jar) that I want to execute.

The statement I am trying to post to the Livy session:

   import com.path.to.Compactor 
   Compactor.compact(x, y, z) 

The Compactor class is in small-file-compactor-lib-1.0-SNAPSHOT-all.jar.

This is the error I am getting:

<console>:23: error: object path is not a member of package com
        import com.path.to.Compactor

When I run spark-shell --jars small-file-compactor-lib-1.0-SNAPSHOT-all.jar, the above code works fine.

I have tried passing this jar in the jars argument of the Livy REST API when creating the session, and the application logs suggest that it is picked up and uploaded to HDFS.
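For reference, a session-creation request of that kind might look like the minimal sketch below. It assumes Livy's POST /sessions endpoint with its kind and jars fields. The host localhost:8998 (Livy's default port) and the helper names build_session_request and create_session are placeholders, and the jar path is the HDFS location that shows up in the sc.jars output.

```python
import json
import urllib.request

# Assumption: Livy is reachable on its default port on the local host.
LIVY_URL = "http://localhost:8998"


def build_session_request(jars):
    """Build the JSON body for POST /sessions with extra jars attached."""
    return {"kind": "spark", "jars": list(jars)}


def create_session(livy_url, jars):
    """Send the session-creation request and return Livy's parsed JSON reply."""
    body = json.dumps(build_session_request(jars)).encode("utf-8")
    request = urllib.request.Request(
        f"{livy_url}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the payload; call create_session(LIVY_URL, ...) against a real server.
    jars = ["hdfs:///user/livy/small-file-compactor-lib-1.0-SNAPSHOT-all.jar"]
    print(json.dumps(build_session_request(jars), indent=4))
```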

Initially I stored the jar in S3 and passed the S3 link in the API call. Then I tried putting it in HDFS, and I even placed the jar in the /usr/lib/livy/repl_2.12-jars/ directory so that it gets uploaded along with the other jars. That did not solve the import problem either.

I have looked into Spark's working directory and the jar is present.

I also posted the statement println(sc.jars) and got this:

ArrayBuffer(file:/usr/lib/livy/rsc-jars/livy-api-0.7.0-incubating.jar, file:/usr/lib/livy/rsc-jars/livy-rsc-0.7.0-incubating.jar, file:/usr/lib/livy/rsc-jars/livy-thriftserver-session-0.7.0-incubating.jar, file:/usr/lib/livy/rsc-jars/netty-all-4.1.17.Final.jar, hdfs:///user/livy/small-file-compactor-lib-1.0-SNAPSHOT-all.jar, file:/usr/lib/livy/repl_2.12-jars/commons-codec-1.9.jar, file:/usr/lib/livy/repl_2.12-jars/livy-core_2.12-0.7.0-incubating.jar, file:/usr/lib/livy/repl_2.12-jars/livy-repl_2.12-0.7.0-incubating.jar, file:/usr/lib/livy/repl_2.12-jars/small-file-compactor-lib-1.0-SNAPSHOT-all.jar)

My jar is present in this list.

So why is Spark not able to import the class?


Solution

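  • It looks like there is a bug in the Livy version shipped with EMR 6.1.0. The application classes do get loaded into the JVM, yet the import in the interpreter still fails, as the error in the question shows.

    There is a workaround: instead of importing the class, load it through the context class loader and call the method via reflection.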

    {
        "kind": "spark",
        "code": "Thread.currentThread.getContextClassLoader.loadClass(\"com.path.to.Compactor\").getMethod(\"compact\", classOf[String], classOf[String], classOf[String]).invoke(null, \"input 1\", \"input 2\", \"input 3\")"
    }
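Hand-escaping the quotes in a one-line code string like this is error-prone. As a minimal sketch, the statement body could instead be generated with Python's standard json module. The helper names scala_string, build_reflection_code and build_statement are hypothetical, the class and method names are the ones from the question, and every parameter is assumed to be a String, as in the workaround. The generated Scala passes the arguments to invoke individually, because a Scala Array handed to a Java varargs parameter is treated as a single argument.

```python
import json


def scala_string(value):
    """Render plain text as a double-quoted Scala string literal.

    JSON escaping is reused here; that is adequate for simple strings
    such as class names and paths, not a general Scala escaper.
    """
    return json.dumps(value)


def build_reflection_code(class_name, method_name, args):
    """Return Scala source that calls a static-style method via reflection."""
    # getMethod takes the method name followed by one classOf[String] per argument.
    get_method_args = ", ".join(
        [scala_string(method_name)] + ["classOf[String]"] * len(args)
    )
    # invoke takes null as the receiver, then each argument on its own.
    invoke_args = ", ".join(["null"] + [scala_string(a) for a in args])
    return (
        "Thread.currentThread.getContextClassLoader"
        f".loadClass({scala_string(class_name)})"
        f".getMethod({get_method_args})"
        f".invoke({invoke_args})"
    )


def build_statement(class_name, method_name, args):
    """Build the JSON body for posting a statement to a Livy session."""
    return {
        "kind": "spark",
        "code": build_reflection_code(class_name, method_name, args),
    }


if __name__ == "__main__":
    statement = build_statement(
        "com.path.to.Compactor", "compact", ["input 1", "input 2", "input 3"]
    )
    # json.dumps takes care of escaping the embedded double quotes.
    print(json.dumps(statement, indent=4))
```

The printed JSON can be used as the request body when posting the statement to the session.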