Search code examples
sqlscalaapache-sparkhive

Change code window function from scala to sql


I'm trying to change the written scala code to sql code, now I got it, which I think is not entirely correct, since there are two window functions in the scale code and I combined them into one. I also need this code to be in the form of a cte table(Hive sql version).

Scala code:

val window1 = Window.partitionBy("rating_agency","fininstid").orderBy(desc("dt"))

   val table1= table2.where(
      col("rating_agency").isInCollection(List("Moody's", "Standard & Poor's", "Fitch Ratings"))
      and col("rating_object_type")==="Company"
    ).withColumn("rn", rank().over(window1)).where(col("rn") === 1).drop("rn")

Hive SQL code I tried:

table1 AS (
SELECT a.*, 
FROM
    (SELECT m.*,
            RANK() OVER (PARTITION BY rating_agency, fininstid  ORDER BY dt DESC) AS rn
    FROM table2 m
    WHERE rating_agency IN ('Moody''s', 'Standard & Poor''s', 'Fitch Ratings')  AND rating_object_type = 'Company'
    ) a
WHERE a.rn = 1
)

Solution

  • Your code is almost correct, except the unneccessary ',' after 'a.*', which would cause compile error. Assume table2 has 3 columns (col_a, col_b, col_c), the hive SQL can be written as

    table1 AS (
    SELECT a.col_a, a.col_b, a.col_c 
    from
    (SELECT m.col_a,m.col_b,m.col_c,
            RANK() OVER (PARTITION BY rating_agency, fininstid  ORDER BY dt DESC) AS rn
        FROM table2 m
        WHERE rating_agency IN ('Moody''s', 'Standard & Poor''s', 'Fitch Ratings')  AND rating_object_type = 'Company'
    ) a
    WHERE a.rn = 1
    )