
How to stop NpgsqlDataReader from blocking?


When I run the following code against a large PostgreSQL table, the NpgsqlDataReader object blocks until all the data has been fetched.

NpgsqlCommand cmd = new NpgsqlCommand(strQuery, _conn);
NpgsqlDataReader reader = cmd.ExecuteReader(); // <-- takes 30 seconds

How can I get it to behave such that it doesn't prefetch all the data? I want to step through the resultset row by row without having it fetch all 15 GB into memory at once.

I know there were issues with this sort of thing in Npgsql 1.x but I'm on 2.0. This is against a PostgreSQL 8.3 database on XP/Vista/7. I also don't have any funky "force Npgsql to prefetch" stuff in my connection string. I'm at a complete loss for why this is happening.


Solution

  • I'm surprised the driver doesn't provide a way to do this, but you could manually execute the SQL statements to declare a cursor (in PostgreSQL, DECLARE also opens it) and fetch from it in batches, e.g. (this code is very dubious as I'm not a C# guy):

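    // DECLARE takes "CURSOR FOR <query>"; it must run inside a transaction (see notes below).
    new NpgsqlCommand("DECLARE cur_data NO SCROLL CURSOR FOR "
                      + strQuery, _conn).ExecuteNonQuery();
    int rows;  // declared outside the loop so the while condition can see it
    do {
       rows = 0;
       // Each FETCH transfers at most 100 rows to the client.
       using (NpgsqlDataReader reader =
                  new NpgsqlCommand("FETCH 100 FROM cur_data", _conn).ExecuteReader()) {
          while (reader.Read()) {
             rows++;
             // process the current row here
          }
       }
    } while (rows > 0);  // an empty batch means the cursor is exhausted
    new NpgsqlCommand("CLOSE cur_data", _conn).ExecuteNonQuery();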
    

    Note that:

    • You need to be inside a transaction block to use a cursor, unless you specify the WITH HOLD option when declaring it, in which case the server will spool the results to a server-side temp file (you still won't have to transfer them all at once, though).
    • The cursor_tuple_fraction setting (available in PostgreSQL 8.4 and later, default 0.1) may cause a different plan to be used when executing a query via a cursor as opposed to in immediate mode, because the planner assumes only a fraction of the rows will be fetched. You may want to do "SET cursor_tuple_fraction=1" just before declaring the cursor, since you actually intend to fetch all of the cursor's output.
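
    Putting the two notes together, the statements sent over the connection might look like the sketch below. It is only an illustration: big_table is a hypothetical stand-in for the real query, the batch size of 100 is arbitrary, and the SET LOCAL line assumes PostgreSQL 8.4 or later (on 8.3 the parameter does not exist, so that line would have to be dropped).

```sql
-- Hypothetical session; big_table stands in for the real query.
BEGIN;                                  -- a cursor without WITH HOLD only lives inside a transaction
SET LOCAL cursor_tuple_fraction = 1.0;  -- 8.4+ only: tell the planner the whole result will be read
DECLARE cur_data NO SCROLL CURSOR FOR
    SELECT * FROM big_table;
FETCH 100 FROM cur_data;                -- repeat until a FETCH returns zero rows
CLOSE cur_data;
COMMIT;                                 -- SET LOCAL reverts automatically at this point
```

    From C#, each statement would be issued as its own NpgsqlCommand on the same connection, with only the FETCH going through ExecuteReader.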