c# apache-spark parquet mobius

Is there a way to read from Parquet files in hdfs into SqlContext from Mobius?


I know that in Scala you can read a Parquet file as follows:

// Create Spark Context
val sparkConf = new SparkConf().setAppName(appName).setMaster(sparkMaster)
val sc = new SparkContext(sparkConf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

val pf = sqlContext.read.parquet(hdfsDataUri + "test.parquet")
pf.registerTempTable("test")

Is there a way to do this using Mobius (the C# API for Spark)? I could only find a way to read CSV files. Ref: https://github.com/Microsoft/Mobius


Solution

  • Mobius provides a C# API for reading Parquet files in Apache Spark. The following is the C# equivalent of the Scala code in your question:

            var sparkConf = new SparkConf().SetAppName(appName).SetMaster(sparkMaster);
            var sc = new SparkContext(sparkConf);
            var sqlContext = new SqlContext(sc);
            var pf = sqlContext.Read().Parquet(hdfsDataUri + "test.parquet");
            pf.RegisterTempTable("test");
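
Once the table is registered, it can be queried with Spark SQL in the same way as in Scala. Here is a minimal sketch of a complete program, assuming the Mobius namespaces `Microsoft.Spark.CSharp.Core` and `Microsoft.Spark.CSharp.Sql`. The values for `appName`, `sparkMaster` and `hdfsDataUri` are hypothetical placeholders and are not taken from the question, so adjust them to your cluster:

```csharp
using Microsoft.Spark.CSharp.Core;
using Microsoft.Spark.CSharp.Sql;

class ParquetQueryExample
{
    static void Main(string[] args)
    {
        // Hypothetical placeholder values; replace with your own settings.
        var appName = "ParquetQueryExample";
        var sparkMaster = "local[*]";
        var hdfsDataUri = "hdfs://namenode:9000/data/";

        var sparkConf = new SparkConf().SetAppName(appName).SetMaster(sparkMaster);
        var sc = new SparkContext(sparkConf);
        var sqlContext = new SqlContext(sc);

        // Load the Parquet file and expose it to Spark SQL as "test".
        var pf = sqlContext.Read().Parquet(hdfsDataUri + "test.parquet");
        pf.RegisterTempTable("test");

        // Query the temp table and print the first rows to the console.
        var result = sqlContext.Sql("SELECT * FROM test");
        result.Show();

        sc.Stop();
    }
}
```

This has to be launched through Mobius (for example with its sparkclr-submit script) against a working Spark installation. It will not run as a plain standalone executable.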