Tags: python, pandas, asyncpg

How to use asyncpg.copy_to_table with Pandas Dataframe


I'm trying to write a large Pandas dataframe to Postgres with asyncpg, but I'm getting errors when I try to do it with the copy_to_table function.

I have working code that uses psycopg2's copy_from with a StringIO, but the same pattern fails when I try to implement it with asyncpg.

Using StringIO

sio = StringIO(df.to_csv(index=None, header=None))
sio.seek(0)
async with pg_pool.acquire() as conn:
    async with conn.transaction():
        s = await conn.copy_to_table('tmp_table', source=sio, columns=list(df.columns), delimiter=',')

This is the error I get using StringIO:

Exception:  memoryview: a bytes-like object is required, not 'str'

I also tried loading the dataframe into a BytesIO object, but I hit a different error before the data even reaches asyncpg, because to_csv returns a str:

bio = BytesIO(df.to_csv(index=None, header=None))
bio.seek(0)

TypeError: a bytes-like object is required, not 'str'

I'm pretty sure I'm converting the dataframe to bytes incorrectly here. Either way, I just want to use asyncpg to load a large dataframe into Postgres via the COPY command, not row by row.
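
Both errors point at the same thing: df.to_csv() returns a str, while asyncpg's COPY machinery reads bytes from a file-like source. Below is a minimal sketch of what the bytes-based variant would look like, reusing the tmp_table, pg_pool and dataframe from above; it is a guess at the fix, not something verified here:

from io import BytesIO

# to_csv() returns a str; encode it so the buffer holds bytes,
# which is what copy_to_table expects from a file-like source
bio = BytesIO(df.to_csv(index=False, header=False).encode('utf-8'))

async with pg_pool.acquire() as conn:
    async with conn.transaction():
        await conn.copy_to_table(
            'tmp_table',
            source=bio,
            columns=list(df.columns),
            format='csv',  # parse the buffer as CSV rather than the default text format
        )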


Solution

  • I was over-complicating things for myself. copy_records_to_table works; just convert the data to tuples first.

    # Convert each dataframe row into a plain tuple of values
    tuples = [tuple(x) for x in df.values]
    
    # Bulk-load the records with a single COPY instead of row-by-row inserts
    s = await conn.copy_records_to_table(table_name, records=tuples, columns=list(df.columns), timeout=10)
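
    For completeness, here is a sketch of how that call sits inside the same pool/transaction pattern from the question; the table name 'tmp_table' and the timeout are carried over as assumptions rather than part of the original answer:

    tuples = [tuple(x) for x in df.values]

    async with pg_pool.acquire() as conn:
        async with conn.transaction():
            # One COPY round trip for the whole dataframe instead of per-row INSERTs
            await conn.copy_records_to_table(
                'tmp_table',
                records=tuples,
                columns=list(df.columns),
                timeout=10,
            )

    One thing worth checking: asyncpg encodes each value against the target column's Postgres type, so the dataframe's values need to map cleanly onto those types (for example, NaN is not the same as NULL and may need to be replaced with None first).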