I have a database select output obtained using psycopg2
. The output looks like below
result = [('("",Loc,001)',), ('(London,Loc,002)',), ('(Tokyo,Loc,006)',), ('("New York","New Loc",TEST2)',), ('(Berlin,Loc,TEST1)',), ('(Dubai,Loc,TEST4)',)]
This output has been obtained against a distinct query I ran on multiple columns something like
SELECT distinct(col1, col2, col3) from table;
But the output is very complicated to process since some strings inside each tuple don't have a quote.
I tried doing something like below
import ast
formatted_result = [item[0] for item in result]
print(ast.literal_eval(formatted_result))
But it throws the following error
Traceback (most recent call last):
File "test.py", line 3, in <module>
print(ast.literal_eval(formatted_result))
File "/usr/lib/python2.7/ast.py", line 80, in literal_eval
return _convert(node_or_string)
File "/usr/lib/python2.7/ast.py", line 79, in _convert
raise ValueError('malformed string')
ValueError: malformed string
The other option is going though each tuple, split by comma and remove unnecessary characters like below
for item in formatted_result:
col1 = item.split(",")[0].replace('(', "").replace( ")", "").replace('"', "")
col2 = item.split(",")[1].replace('(', "").replace( ")", "").replace('"', "")
col3 = item.split(",")[2].replace('(', "").replace( ")", "").replace('"', "")
What is the proper way to process this output?
Using brackets around your columns is going to return a row-construct
, that's why your output looks messed up.
query = "select distinct(ts, popularity) from table where id < 28050"
cur.execute(query)
print(cur.fetchall())
>>> [('(1640082115,23)',), ('(1640082115,28)',), ...]
Whereas the same query without brackets is going to return the expected result:
query = "select distinct ts, popularity from table where id < 28050"
cur.execute(query)
print(cur.fetchall())
>>> [(1640082115, 23), (1640082115, 28), (1640082116, 51), ...]