I am using Spark 2.4.4 and trying to build a DataFrame with the code below.
val spark = SparkSession
  .builder
  .master("local[*]")
  .appName("App")
  .getOrCreate
import spark.sqlContext.implicits._
import spark.implicits._
val justNow = spark.sparkContext.parallelize(
  Seq(
    Row("1", "One"),
    Row("2", "Tow")
  )
).toDF
The code above is defined inside the main method, but it fails to compile with an error saying that toDF is not a member of RDD. Following other posts on Stack Overflow, I added the implicits imports to get rid of the error, but I am still getting it.
error: value toDF is not a member of org.apache.spark.rdd.RDD[org.apache.spark.sql.Row]
possible cause: maybe a semicolon is missing before `value toDF'?
Error occurred in an application involving default arguments.
Can someone please help? Thanks!
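You can use the createDataFrame method instead. toDF is not suitable for an RDD of Rows: the implicit conversion that adds toDF needs an Encoder for the element type, and spark.implicits._ provides encoders for primitives, tuples, and case classes, but not for the untyped Row. With Rows you have to pass the schema explicitly.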
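import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

// Row carries no type information, so describe the columns explicitly.
val schema = StructType(Seq(StructField("col1", StringType), StructField("col2", StringType)))
// `spark` is the SparkSession from the question; a bare `sc` exists only in spark-shell.
val rows = spark.sparkContext.parallelize(Seq(Row("1", "One"), Row("2", "Tow")))
val df = spark.createDataFrame(rows, schema)
df.show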
+----+----+
|col1|col2|
+----+----+
|   1| One|
|   2| Tow|
+----+----+
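If you do not actually need Row objects, another option is to keep toDF and build the RDD from tuples, since spark.implicits._ does supply encoders for those. The following is a minimal sketch written as a Scala script rather than a main method. The column names col1 and col2 are reused from the schema above, and the value name tupleDf is only illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  .master("local[*]")
  .appName("App")
  .getOrCreate()

// Brings the implicit conversion that adds toDF to RDDs and Seqs into scope.
import spark.implicits._

// Tuples are Product types, so an implicit Encoder is available for them,
// unlike Row. The column names are passed to toDF directly.
val tupleDf = spark.sparkContext
  .parallelize(Seq(("1", "One"), ("2", "Tow")))
  .toDF("col1", "col2")

tupleDf.show()
```

This should give the same two string columns as the createDataFrame version. The createDataFrame approach remains the one to use when the rows or the schema are only known at runtime.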