Search code examples
pythonstringtuplespsycopg2

Unable to convert a string tuple into a regular tuple in python


I have a database select output obtained using psycopg2. The output looks like below

result = [('("",Loc,001)',), ('(London,Loc,002)',), ('(Tokyo,Loc,006)',), ('("New York","New Loc",TEST2)',), ('(Berlin,Loc,TEST1)',), ('(Dubai,Loc,TEST4)',)]

This output has been obtained against a distinct query I ran on multiple columns something like

SELECT distinct(col1, col2, col3) from table;

But the output is very complicated to process since some strings inside each tuple don't have a quote.

I tried doing something like below

import ast
formatted_result = [item[0] for item in result]
print(ast.literal_eval(formatted_result))

But it throws the following error

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print(ast.literal_eval(formatted_result))
  File "/usr/lib/python2.7/ast.py", line 80, in literal_eval
    return _convert(node_or_string)
  File "/usr/lib/python2.7/ast.py", line 79, in _convert
    raise ValueError('malformed string')
ValueError: malformed string

The other option is going though each tuple, split by comma and remove unnecessary characters like below

for item in formatted_result:
    col1 = item.split(",")[0].replace('(', "").replace( ")", "").replace('"', "")
    col2 = item.split(",")[1].replace('(', "").replace( ")", "").replace('"', "")
    col3 = item.split(",")[2].replace('(', "").replace( ")", "").replace('"', "")

What is the proper way to process this output?


Solution

  • Using brackets around your columns is going to return a row-construct, that's why your output looks messed up.

    query = "select distinct(ts, popularity) from table where id < 28050"
    cur.execute(query)
    print(cur.fetchall())
    >>> [('(1640082115,23)',), ('(1640082115,28)',), ...]
    

    Whereas the same query without brackets is going to return the expected result:

    query = "select distinct ts, popularity from table where id < 28050"
    cur.execute(query)
    print(cur.fetchall())
    >>> [(1640082115, 23), (1640082115, 28), (1640082116, 51), ...]