I am busy upgrading an old system that uses a Postgres database v9 to v14, I need to update the Npgsql library from its current version 2.2.7 to 4.1.12 on a C# .NetFramework 4.6.1 application.
In my tests, I have noticed that a command that executes a function and reads the response into an array using NpgsqlDataReader takes 10 times longer if I use Npgsql v3 or higher.
The Code looks as follow the while (dbReader.Read()) loop: Running Npgsql v2.2.7 takes 10 seconds to run through 20000 rows when connected to the Postgres v9 database. Running Npgsql v3 or higher takes 140 seconds to run through 20000 rows when connected to the Postgres v9 or v14 database (the db does not make a difference).
using Npgsql;
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Data;
namespace UpgradeV2.Sample
{
internal class SampleRequest
{
private static string connectionstring = "server=myhostname;port=5432;database=mydatabase;user id=myuser;password=mypassword;enlist=true;pooling=false;minpoolsize=1;maxpoolsize=100;timeout=1024;commandtimeout=3000;";
public static IEnumerable<DataClass> GetData(long id)
{
var parameters = new Dictionary<string, object>
{
{"i_id", (int)id}
};
using (NpgsqlConnection _npgsqlConnection = new NpgsqlConnection(connectionstring))
{
var command = new NpgsqlCommand
{
CommandType = CommandType.StoredProcedure,
Connection = _npgsqlConnection,
CommandText = "public.function_returning_lots_of_data"
};
foreach (var parameter in parameters)
{
command.Parameters.Add(new NpgsqlParameter { ParameterName = parameter.Key, Value = parameter.Value });
}
using (var dbReader = command.ExecuteReader())
{
var dataList = new Collection<DataClass>();
// to complete this while loop with npgsql v2 takes 10 seconds
// to complete this while loop with npgsql v3-v6 takes 140 seconds
while (dbReader.Read())
{
if (dbReader["id"] == DBNull.Value)
{
continue;
}
DataClass item = new DataClass()
{
id = dbReader.GetFieldValue<int>("id"),
//And about 20 other Fields read using GetFieldValue
};
dataList.Add(item);
}
return dataList;
}
}
}
}
}
I tried different newer versions of the Npgsql library and it all results in the same problem from version 3 onwards. I tried updating the connection string to a higher value for "Read Buffer Size" and it does not have any improvement I tried using the ExecuteAsync and ReadAsync methods, which have little to no effect.
Newer Npgsql versions are very unlikely to actually be slower than 2.2.7; you haven't posted the complete code, so I'm going to assume that there's some sort of problem with the way you're measuring. For example, it's very possible that the first Npgsql operation is slow, since e.g. Npgsql needs to load type information from the database; so if you're just measuring the time for a single request, that's going to be very skewed.
I highly recommend writing up a proper benchmark with BenchmarkDotNet; you just have to write a single function, and that library handles all the actual benchmarking for you (e.g. it runs it in a warmup phase before starting a measurement).
If you can submit a BenchmarkDotNet benchmark that shows 2.2.7 actually performing significantly better than modern versions, I'd definitely want to look at it. I also suggest using recent versions (e.g. 8.0) and not 4.x, which itself is very old by now.