Search code examples
hadoophivehql

Why can't hive recognize alias named in select part?


Here's the scenario: When I invoke hql as follows, it tells me that it cannot find alias for u1.

hive> select user as u1, url as u2 from rank_test where u1 != "";
FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'u1': (possible column names are: user, url)

This problem is the same as when I try to use count(*) as cnt. Could anyone give me some hint on how to use alias in where clause? Thanks a lot!

hive> select user, count(*) as cnt from rank_test where cnt >= 2 group by user;
FAILED: ParseException line 1:58 missing EOF at 'where' near 'user'

Solution

  • The where clause is evaluated before the select clause, which is why you can't refer to select aliases in your where clause.

    You can however refer to aliases from a derived table.

    select * from (
      select user as u1, url as u2 from rank_test
    ) t1 where u1 <> "";
    
    select * from (
      select user, count(*) as cnt from rank_test group by user
    ) t1 where cnt >= 2;
    

    Side note: a more efficient way to write the last query would be

    select user, count(*) as cnt from rank_test group by user
    having count(*) >= 2
    

    If I remember correctly, you can refer to the alias in having i.e. having cnt >= 2