Search code examples
cpostgresqlbatch-processingpostgresql-9.3libpq

Streaming libpq results from PQexec


In attempting to download a large batch of data from Postgres 9.3 in C via libpq's PQexec, I'm wondering if there's a way to stream the result set from the server. From watching top, I can see my memory utilization increasing by roughly the expected result set size when I call PQexec. I want to iterate over the results and "throw away" the results after I've processed them to keep the memory utilization on the processing node down. I couldn't find any mention of such behavior in the docs. Something like psycopg2's fetchmany would be ideal.

Is the only way to "stream" results from libpq by rewriting the query to run in batches?


Solution

  • With a recent enough libpq (libpq-5.5 shipping with PostgreSQL 9.2), you may call PQsetSingleRowMode as described in Retrieving Query Results Row-By-Row. PQexec will have to be replaced by the asynchronous PQsendQuery.

    Otherwise a technique that works with all versions is to open a cursor at the SQL level (see DECLARE...CURSOR) and FETCH from it through successive PQexec calls until completion. This is how FETCH_COUNT is implemented in psql, the command-line tool.
    It also has the advantage that other queries can be executed on the same connection while the cursor is opened and the program is still looping over the results.