Search code examples
pythonsqlimpala

Impala/SQL: select a sub-table and join


I am trying to use the following code to find the last month data in table_1, then left join it with table_2:

import pandas as pd

query = 'select * from table_1 where table_1.ts > "2016-07-12 00:00:00" as recent_table left join table_2 on table_1.t2__fk=table_2.id' 

cursor = impala_con.cursor()
cursor.execute('USE my_db')
cursor.execute(query)
df_result = as_pandas(cursor)
df_result

but got the error below:

HiveServer2Error: AnalysisException: Syntax error in line 1:
...s > "2016-07-10 00:00:00" as recent_table left join table_2...
                             ^
Encountered: AS
Expected: AND, BETWEEN, DIV, GROUP, HAVING, ILIKE, IN, IREGEXP, IS, LIKE, LIMIT, NOT, OFFSET, OR, ORDER, REGEXP, RLIKE, UNION

CAUSED BY: Exception: Syntax error

Does anyone know what I missed here? And what's the proper way to achieve this goal. Thanks!


Solution

  • That's cause your query syntax is wrong. You can't use alias for conditional statement as pointed below. Aliases are used only for table name and column name.

    where table_1.ts > "2016-07-12 00:00:00" as recent_table
    

    The correct query would be

    select t1.* 
    from table_1 t1
    left join table_2 t2 on t1.t2__fk = t2.id
    where t1.ts > "2016-07-12 00:00:00";