I am trying to use the following code to find the last month data in table_1, then left join it with table_2:
import pandas as pd
query = 'select * from table_1 where table_1.ts > "2016-07-12 00:00:00" as recent_table left join table_2 on table_1.t2__fk=table_2.id'
cursor = impala_con.cursor()
cursor.execute('USE my_db')
cursor.execute(query)
df_result = as_pandas(cursor)
df_result
but got the error below:
HiveServer2Error: AnalysisException: Syntax error in line 1:
...s > "2016-07-10 00:00:00" as recent_table left join table_2...
^
Encountered: AS
Expected: AND, BETWEEN, DIV, GROUP, HAVING, ILIKE, IN, IREGEXP, IS, LIKE, LIMIT, NOT, OFFSET, OR, ORDER, REGEXP, RLIKE, UNION
CAUSED BY: Exception: Syntax error
Does anyone know what I missed here? And what's the proper way to achieve this goal. Thanks!
That's cause your query syntax is wrong. You can't use alias for conditional statement as pointed below. Aliases are used only for table name and column name.
where table_1.ts > "2016-07-12 00:00:00" as recent_table
The correct query would be
select t1.*
from table_1 t1
left join table_2 t2 on t1.t2__fk = t2.id
where t1.ts > "2016-07-12 00:00:00";