When we fetch data using the DB API 2.0 cur.execute("select * from db.table")
we get a cursor which seems like a generator object of list of tuples.
Whereas in pymongo, when we fetch we get it as list of dictionaries. I wanted to achieve something like this.
Instead of fetching list of tuples, I wanted list of dictionaries or a named tuple.
I believe from an efficiency point of view it makes sense, since the schema is already defined so no need to send it for every record.
Currently the workaround I am using is:
cur.execute("select * from db.table")
columns = cur.columns_with_types
data = cur.fetchall()
df = pd.DataFrame(data,columns=[tuple[0] for tuple in columns])
data_reqd = df.to_dict('records')
This methods fairs poorly when query returns a lot of data.
Workaround 1: Use fetchmany(size=block_size)
but it doesn't seem like an elegant way to do things.
Workaround 2: This seems like a much better way to handle things.
cur.execute("select * from db.table")
columns = cur.columns_with_types
for tup in cur:
row = dict(zip(columns, tup))
# use row
Any good way to handle this? Any improvements to the question are appreciated.
You can alternatively create a Client
and call its query_dataframe
method.
import clickhouse_driver as ch
ch_client = ch.Client(host='localhost')
df = ch_client.query_dataframe('select * from db.table')
records = df.to_dict('records')