Search code examples
python-3.xpostgresqlpsycopg2stringio

Bulk insert using PostgreSQL copy_from, psycopg2 and StringIO


I need to insert bunch of lines in Postgres table from python script and am using an advice to use copy_from for performance purpose.

import io
data_io = io.StringIO()

# here I have a loop which is omitted for simplicity
data_io.write("""%s\t%s\n""" % (115, 500))

DB = sql.DB()
DB._db_cur.copy_from(data_io, "temp_prices", columns=('id', 'price'))

In my code I am using a few loops to populate 'data' with values, above is an example.

But the table 'temp_prices' is empty and no errors were thrown

I know the data I needed is written into data_io, because I can see it when using:

print (data_io.getvalue())

What am I doing wrong?


Solution

  • You didn't seek back to the start. Reading and writing in files uses a file position pointer, which advances each time you write to a file, or read back from it. Right now the pointer is at the end of the file object, so reading won't return anything:

    >>> import io
    >>> data_io = io.StringIO()
    >>> data_io.write("""%s\t%s\n""" % (115, 500))
    8
    >>> data_io.read()
    ''
    

    Use data_io.seek(0) to put the file position back at the start before asking copy_from to read your data:

    >>> data_io.seek(0)
    0
    >>> data_io.read()
    '115\t500\n'
    

    On a side note: I'd use the csv module to write tab-separated data to a file:

    import csv
    
    writer = csv.writer(data_io, delimiter='\t')
    writer.writerow((115, 500))