
Unable to import org.apache.spark.streaming.twitter in Spark Scala


The following import fails to compile in SBT:

import org.apache.spark.streaming.twitter._

[error] /home/hduser/workspace/TweetStream/src/main/scala/TweetStream.scala:8: object twitter is not a member of package org.apache.spark.streaming
[error] import org.apache.spark.streaming.twitter._
[error]  

The following line then fails as well:

val tweetStream = TwitterUtils.createStream(ssc, None, filters, StorageLevel.MEMORY_ONLY_SER_2).map(gson.toJson(_))


[error] /home/hduser/workspace/TweetStream/src/main/scala/TweetStream.scala:36: not found: value TwitterUtils
[error]     val tweetStream = TwitterUtils.createStream(ssc, None, filters, StorageLevel.MEMORY_ONLY_SER_2).map(gson.toJson(_))
[error]                       ^

The build.sbt below passes all dependency resolution:

name := "TweetStream"
version := "1.0"
scalaVersion := "2.11.7"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2" 
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.5.2"
libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "1.5.2"
libraryDependencies += "com.google.code.gson" % "gson" % "2.7"
libraryDependencies += "org.twitter4j" % "twitter4j-core" % "4.0.4"

Have I added the wrong dependency?


Solution

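  • The question's build.sbt lists spark-streaming twice but never pulls in spark-streaming-twitter, the separate artifact that provides the org.apache.spark.streaming.twitter package and TwitterUtils. Here's a build.sbt that includes it: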

    lazy val root = (project in file(".")).
      settings(
        name := "TweetStream",
        version := "1.0",
        scalaVersion := "2.11.7",
        mainClass in Compile := Some("TweetStream")        
      )
    
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.5.2",
      "org.apache.spark" %% "spark-streaming" % "1.5.2",
      "org.apache.spark" % "spark-streaming-twitter_2.11" % "1.5.2",
      "com.google.code.gson" % "gson" % "2.7",
      "org.twitter4j" % "twitter4j-core" % "3.0.3",
      "org.twitter4j" % "twitter4j-stream" % "3.0.3"
    )
    
    // Discard META-INF entries; for any other conflict, keep the first copy found
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard
      case x => MergeStrategy.first
    }
    

    And the assembly.sbt in the project subfolder:

    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
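If a fat jar is not needed, the change to the question's original build.sbt is probably a single line. A minimal sketch, assuming Spark 1.5.2 on Scala 2.11 as in the question: the second spark-streaming entry is swapped for spark-streaming-twitter, and the twitter4j version is set to 3.0.3 to match the answer above (the 4.0.4 used in the question may conflict with what the Spark connector expects).

```scala
name := "TweetStream"
version := "1.0"
scalaVersion := "2.11.7"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.5.2",
  "org.apache.spark" %% "spark-streaming" % "1.5.2",
  // Separate artifact that contains org.apache.spark.streaming.twitter.TwitterUtils;
  // %% appends the Scala binary version, so this resolves to spark-streaming-twitter_2.11
  "org.apache.spark" %% "spark-streaming-twitter" % "1.5.2",
  "com.google.code.gson" % "gson" % "2.7",
  // Assumption: 3.0.3 is chosen only because the answer above uses it
  "org.twitter4j" % "twitter4j-core" % "3.0.3",
  "org.twitter4j" % "twitter4j-stream" % "3.0.3"
)
```

After reloading the build, both `import org.apache.spark.streaming.twitter._` and `TwitterUtils.createStream(...)` should resolve.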