Tags: apache-spark, hiveql, apache-spark-sql

How to pass multiple statements into Spark SQL HiveContext


For example, I have a few Hive HQL statements that I want to pass into Spark SQL:

set parquet.compression=SNAPPY;
create table MY_TABLE stored as parquet as select * from ANOTHER_TABLE;
select * from MY_TABLE limit 5;

The following doesn't work:

hiveContext.sql("set parquet.compression=SNAPPY; create table MY_TABLE stored as parquet as select * from ANOTHER_TABLE; select * from MY_TABLE limit 5;")

How can I pass these statements into Spark SQL?


Solution

  • Thank you to @SamsonScharfrichter for the answer.

    This will work:

    hiveContext.sql("set spark.sql.parquet.compression.codec=SNAPPY")
    hiveContext.sql("create table MY_TABLE stored as parquet as select * from ANOTHER_TABLE")
    val rs = hiveContext.sql("select * from MY_TABLE limit 5")
    

    Please note that in this particular case, instead of the parquet.compression key, we need to use spark.sql.parquet.compression.codec.
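
    If the statements arrive as a single script (for example, read from an .hql file), one workable approach is to split the script on semicolons and issue each statement through a separate hiveContext.sql call. The sketch below is my own addition, not part of the original answer; its naive splitter assumes no semicolons appear inside string literals or comments, and the script and results names are just illustrative:

    val script =
      """set spark.sql.parquet.compression.codec=SNAPPY;
        |create table MY_TABLE stored as parquet as select * from ANOTHER_TABLE;
        |select * from MY_TABLE limit 5;""".stripMargin

    val results = script
      .split(";")                          // naive statement splitter
      .map(_.trim)
      .filter(_.nonEmpty)
      .map(stmt => hiveContext.sql(stmt))  // one sql() call per statement

    // the last statement is the SELECT, so its DataFrame holds the rows
    results.last.show()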