Search code examples
hiveimpalahue

Use result of a Hive/Impala query inside another query


I'm working with hive/impala and I often run into the need of having to query the results of a show partition to get specific partition. Let's suppose I have a table tbl1 partitioned by fields country and date. So, show partitions tbl1 would result in something like this

country=c1/date=d1
country=c1/date=d3
country=c2/date=d2

I want to do something like select * from (show partitions tbl1) a where a.country='c1' and I want to do this in Hue or shell (hive and impala). Is this possible?


Solution

  • If someone ever finds this useful, this is what I ended up doing. Assuming you have spark-shell or spark2-shell, you can store the output of show partitions in a dataframe and then transform such dataframe. This is what I did (inside spark2-shell:

    val df = spark.sql("show partitions tbl1").map(row => {
        val arrayValues = row.getString(0).split("/")
        (arrayValues.head.split("=")(1), arrayValues(1).split("=")(1))
    }).toDF("country", "date")
    

    this takes the list of partitions (a DataFrame[String]) and splits the dataframe by / and then for each piece, splits for = and takes the value