Search code examples
hadoophivehql

Unable to get a expected output using hive aggregate function


I have a created a table (movies) in Hive as below(id,name,year,rating,views)

1,The Nightmare Before Christmas,1993,3.9,4568 2,The Mummy,1932,3.5,4388 3,Orphans of the Storm,1921,3.2,9062 4,The Object of Beauty,1991,2.8,6150 5,Night Tide,1963,2.8,5126 6,One Magic Christmas,1985,3.8,5333 7,Muriel's Wedding,1994,3.5,6323 8,Mother's Boys,1994,3.4,5733 9,Nosferatu: Original Version,1929,3.5,5651 10,Nick of Time,1995,3.4,5333

I want to write a hive query to get the name of the movie with highest views.

select name,max(views) from movies;

but it gives me an error FAILED: Error in semantic analysis: Line 1:7 Expression not in GROUP BY key name

but doing a group by with name gives me the complete list (which is expected). What changes should I make to my query?


Solution

  • After little bit of digging, I found out that the answer is not so straightforward as we do in SQL. Below query gives the expected result.

    select a.name,a.views from movies a left semi join(select max(views) views from movies)b on (a.views=b.views);