I have a docker image with a script which reads an iql file and executes it in impala. Removing the file reference and running a simple script works fine (ie SELECT 1;), but whenever I try to execute the script from a file I get an error
pyodbc.Error: ('HY000', '[HY000] [Cloudera][ImpalaODBC] (110) Error while executing a query in Impala: [HY000] : ParseException: Syntax error in line 2:\n\n^\nEncountered: EOF\nExpected: ALTER, COMMENT, COMPUTE, COPY, CREATE, DELETE, DESCRIBE, DROP, EXPLAIN, GRANT, INSERT, INVALIDATE, LOAD, REFRESH, REVOKE, SELECT, SET, SHOW, TRUNCATE, UPDATE, UPSERT, USE, VALUES, WITH\n\nCAUSED BY: Exception: Syntax error\n (110) (SQLExecDirectW)'
I have tried removing DriverManagerEncoding=UTF-16 from the cloudera.impalaodbc.ini file and various types of sql query (including changing the file extension to isql, iusql) but always get the same error.
Any ideas?
Python function:
def process_data_to_impala():
conn = pyodbc.connect(DSN='some_DSN', autocommit=True)
crsr = conn.cursor()
with open('/path/to/sql/file.iql','r') as inserts:
for statement in inserts:
crsr.execute(statement)
print(crsr.fetchall())
The problem relates to the way the python code is looping through the statement and the way ';' ends the query.
A better way to do it for a single query in a file is
with open('file/location.iql') as script:
statement = script.read()
print(statement)
crsr = conn.cursor()
crsr.execute(statement)
And when you have multiple queries in a file and you need to loop:
with open('file/location.iql') as script:
sqlscript = script.read()
for statement in sqlscript.split(';'):
print(statement)
crsr = conn.cursor()
crsr.execute(statement)
When reading multiple queries from one script (ie Drop table, create table) you must remove ';' from the last query. This is because the split creates a final blank query after the last ';'. See this question for more info.
I hope this helps somebody in the future!