apache-spark spark-streaming twitter-streaming-api

Spark 2.0.0 twitter streaming driver is no longer available


While migrating from Spark 1.6.2 to Spark 2.0.0, I found that the package org.apache.spark.streaming.twitter has been removed. Twitter streaming is no longer available, and neither is this dependency:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-twitter_2.11</artifactId>
  <version>2.0.0</version>
</dependency>

Can anyone suggest how to process a Twitter stream in the new Spark?


Solution

  • Support for the Twitter connector (and several others) was removed in Spark 2.0.

    You can see it in the removal section of the Release Notes:

    Removals

    The following features have been removed in Spark 2.0:

    • Less frequently used streaming connectors, including Twitter, Akka, MQTT, ZeroMQ

    They have been extracted into a separate package under the Apache Bahir project. The Twitter extension, streaming-twitter, can be added via:

    sbt:

    libraryDependencies += "org.apache.bahir" %% "spark-streaming-twitter" % "2.0.0"
    

    Maven:

    <dependency>
      <groupId>org.apache.bahir</groupId>
      <artifactId>spark-streaming-twitter_2.11</artifactId>
      <version>2.0.0</version>
    </dependency>
    

    More on that (thanks to @IvanShulak) in the Mailing List

    Edit:

    For Spark 2.0.1, use:

    libraryDependencies += "org.apache.bahir" %% "spark-streaming-twitter" % "2.0.1"
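
    Once the Bahir artifact is on the classpath, existing streaming code should need little or no change: as far as I can tell, the Bahir module keeps the original org.apache.spark.streaming.twitter package name, so the old import still resolves. A minimal sketch, assuming the OAuth credentials are passed through twitter4j system properties (the placeholder values, app name, local master and 10-second batch interval are illustrative assumptions, not from the question):

    ```scala
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    // Same import as in Spark 1.6.x; the class now comes from the Bahir artifact.
    import org.apache.spark.streaming.twitter.TwitterUtils

    object TwitterStreamSketch {
      def main(args: Array[String]): Unit = {
        // Placeholder credentials (assumption): twitter4j reads these system properties.
        System.setProperty("twitter4j.oauth.consumerKey", "<consumer-key>")
        System.setProperty("twitter4j.oauth.consumerSecret", "<consumer-secret>")
        System.setProperty("twitter4j.oauth.accessToken", "<access-token>")
        System.setProperty("twitter4j.oauth.accessTokenSecret", "<access-token-secret>")

        // local[2]: one thread for the receiver, one for processing.
        val conf = new SparkConf().setAppName("TwitterStreamSketch").setMaster("local[2]")
        val ssc = new StreamingContext(conf, Seconds(10))

        // None means: build the authorization from the twitter4j properties set above.
        val tweets = TwitterUtils.createStream(ssc, None)

        // Print the text of a few tweets from each batch.
        tweets.map(_.getText).print()

        ssc.start()
        ssc.awaitTermination()
      }
    }
    ```

    If the import fails to resolve, check that the Bahir version matches your Spark version and that the artifact suffix (_2.11 here) matches your Scala version.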