I'm new to Scala.
Here's what I'm trying to understand.
This code snippet gives me an RDD[Int], which doesn't offer toDF:
var input = spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9))
But when I import spark.sqlContext.implicits._, toDF becomes available:
import spark.sqlContext.implicits._
var input = spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9)).toDF
So I looked into the source code: implicits is defined as an object inside the SQLContext class. What I don't understand is how an RDD instance is able to call toDF after the import.
Can anyone help me understand?
Update: I found the following code snippet in the SQLContext class:
object implicits extends SQLImplicits with Serializable {
  protected override def _sqlContext: SQLContext = self
}
toDF is an extension method. With the import, you bring the necessary implicits into scope.
For example, Int doesn't have a method foo:
1.foo() // doesn't compile
But if you define an extension method and import the implicit, the call compiles:
object implicits {
  implicit class IntOps(i: Int) {
    def foo() = println("foo")
  }
}
import implicits._
1.foo() // compiles
The compiler transforms 1.foo() into new IntOps(1).foo().
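To make that rewrite concrete, here is a minimal sketch that compiles both forms side by side. The names ExtensionDemo and doubled are hypothetical, chosen only for illustration, and the method returns a value so the two calls can be compared.

```scala
object ExtensionDemo {
  object implicits {
    // Hypothetical extension method on Int, mirroring IntOps above
    implicit class IntOps(i: Int) {
      def doubled: Int = i * 2
    }
  }

  def main(args: Array[String]): Unit = {
    import implicits._

    // What you write: Int has no `doubled`, so the compiler looks for an implicit wrapper
    val viaExtension = 21.doubled

    // Roughly what the compiler generates for the line above
    val viaExplicit = new IntOps(21).doubled

    assert(viaExtension == viaExplicit)
    println(viaExtension)
  }
}
```

The explicit form needs no implicit lookup at all; the import only matters for the short form.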
Similarly,
object implicits extends SQLImplicits ...
abstract class SQLImplicits ... {
  ...
  implicit def rddToDatasetHolder[T : Encoder](rdd: RDD[T]): DatasetHolder[T] = {
    DatasetHolder(_sqlContext.createDataset(rdd))
  }
  implicit def localSeqToDatasetHolder[T : Encoder](s: Seq[T]): DatasetHolder[T] = {
    DatasetHolder(_sqlContext.createDataset(s))
  }
}
case class DatasetHolder[T] private[sql](private val ds: Dataset[T]) {
  def toDS(): Dataset[T] = ds
  def toDF(): DataFrame = ds.toDF()
  def toDF(colNames: String*): DataFrame = ds.toDF(colNames : _*)
}
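With import spark.sqlContext.implicits._ in scope, the compiler transforms spark.sparkContext.parallelize(List(1,2,3,4,5,6,7,8,9)).toDF
into rddToDatasetHolder(spark.sparkContext.parallelize...).toDF,
i.e. DatasetHolder(_sqlContext.createDataset(spark.sparkContext.parallelize...)).toDF.
RDD itself never gains a toDF method; the implicit conversion wraps the RDD in a DatasetHolder, and toDF is called on that wrapper.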
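The same holder pattern can be reproduced without Spark. The sketch below is only an imitation under stated assumptions: Rows, RowsHolder, toRows and seqToRowsHolder are hypothetical stand-ins for Dataset, DatasetHolder, toDS and localSeqToDatasetHolder, and the Encoder context bound is left out for brevity.

```scala
import scala.language.implicitConversions

object HolderPatternDemo {
  // Hypothetical stand-in for Dataset[T]; not a Spark class
  final case class Rows[T](values: Seq[T])

  // Hypothetical stand-in for DatasetHolder[T]: it exists only to carry the extra method
  final case class RowsHolder[T](rows: Rows[T]) {
    def toRows: Rows[T] = rows
  }

  object implicits {
    // Implicit conversion, analogous in shape to localSeqToDatasetHolder
    implicit def seqToRowsHolder[T](s: Seq[T]): RowsHolder[T] = RowsHolder(Rows(s))
  }

  def main(args: Array[String]): Unit = {
    import implicits._

    // Seq has no `toRows`; the compiler inserts seqToRowsHolder(...) here
    val implicitCall = List(1, 2, 3).toRows

    // The same call with the conversion written out by hand
    val explicitCall = seqToRowsHolder(List(1, 2, 3)).toRows

    assert(implicitCall == explicitCall)
    println(implicitCall)
  }
}
```

Removing the import implicits._ line makes List(1, 2, 3).toRows fail to compile while the explicit call can still be written with a qualified name, which matches the behaviour described in the question.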
You can read more about implicits and extension methods in Scala:
Understanding implicit in Scala
Where does Scala look for implicits?
Understand Scala Implicit classes
https://docs.scala-lang.org/overviews/core/implicit-classes.html
https://docs.scala-lang.org/scala3/book/ca-extension-methods.html
https://docs.scala-lang.org/scala3/reference/contextual/extension-methods.html
How extend a class is diff from implicit class?
About spark.implicits._:
Importing spark.implicits._ in scala
What is imported with spark.implicits._?
import implicit conversions without instance of SparkSession