Search code examples
sqlhadoopapache-hive

trying to find the max of select statement in HIVE


I am trying to yield a the top person by weight in the below script. I have a working version way below which returns Matt Holiday with 250 as weight, and now that is all i want The player with Max weight and him only not anyone else

SELECT DISTINCT n.fname, n.lname, MAX(n.weight) FROM master n 
JOIN (SELECT b.id as id, b.year as year, b.triples as triples FROM batting b 
WHERE year == 2005 AND triples > 5) x
ON x.id = n.id
ORDER BY n.weight DESC;

now this comes up with an error like this

Failed: Semantic Exception [Error 10128]: Line 4:34 Not yet supported place for UDAF 'MAX'

However this script returns what i expected, output below

SELECT DISTINCT n.fname, n.lname, n.weight FROM master n 
JOIN (SELECT b.id as id, b.year as year, b.triples as triples FROM batting b 
WHERE year == 2005 AND triples > 5) x
ON x.id = n.id
ORDER BY n.weight DESC;

output

Matt Holiday 250
Bill Dickey 205
Bob Feller 200
Tom Glavine 190

Solution

  • you have an aggregation function and in order to get the result you want you need to use group by

    SELECT n.fname, n.lname, MAX(n.weight) FROM master n 
    JOIN (SELECT b.id as id, b.year as year, b.triples as triples FROM batting b 
    WHERE year == 2005 AND triples > 5) x
    ON x.id = n.id
    GROUP BY n.fname,n.lname
    ORDER BY n.weight DESC
    LIMIT 1;
    

    Parameters or Arguments

    SELECT expression1, expression2, ... expression_n, 
           aggregate_function (aggregate_expression)
    FROM tables
    [WHERE conditions]
    GROUP BY expression1, expression2, ... expression_n;
    

    expression1, expression2, ... expression_n Expressions that are not encapsulated within an aggregate function and must be included in the GROUP BY Clause at the end of the SQL statement http://www.techonthenet.com/sql/group_by.php

    this is probably because the same rule exist in HiveQL as well