Search code examples
c#postgresqltimeoutnpgsql

npgsql timeout during reading attempt


I have a C# function to do a bulk insert in postgresql. What basically is happening here is that i have an array of generic entities, which will be turned into a DataTable, which is then used with a BinaryImport to insert the data into my postgres database. After the bulk insert i read the inserted data back from the database and populate the entities with this data (i use a temporary column do ascertain if a row of the table has been inserted with the last bulk insert).

This worked fine until i have tried to do a bulk insert with approximately 1.000.000 (978.767) rows. (Before i have only used ~ 500.000 rows)

After ~ 4 minutes I get the following Exception:

Npgsql.NpgsqlException: Exception while reading from stream ---> System.TimeoutException: Timeout during reading attempt

The exception occurs on the line while(reader.Read()) of the part where i read back the inserted data.

Here is my code:

public IEnumerable<TEntity> InsertBulkReader<TEntity>(TEntity[] entities)
{
    var table = ClientEntity.GetTable<TEntity>();
    var nonIdColumns = table.Columns.Where(column => !column.IsIdentity);
    var timeStampColumn = new Column() { Name = "TMP_BULK_INSERT_TIME_STAMP", PgType = NpgsqlDbType.TimestampTz };
    //Adding timeStampColumn to the actual table

    using (var connection = new NpgsqlConnection(base.ConnectionString))
    {

        connection.Open();

        using (var transactionScope = TransactionScopeUtil.Create(Transaction.Current))
        {
            var cmd = $"COPY {table.FullName()} ({string.Join(",", nonIdColumns.Concat(new List<Column>() { timeStampColumn }).Select(column => column.Name))}) FROM STDIN BINARY";

            using (var writer = connection.BeginBinaryImport(cmd))
            {
                writer.Timeout = TimeSpan.FromMinutes(10);

                foreach (DataRow row in dataTable.Rows)
                {
                    writer.StartRow();
                    foreach (var column in nonIdColumns)
                    {
                        if (column.PgType == NpgsqlDbType.Unknown || column.PgType == (NpgsqlDbType.Array | NpgsqlDbType.Unknown))
                        {
                            writer.Write(row[column.Name]);
                        }
                        else
                        {
                            writer.Write(row[column.Name], column.PgType);
                        }
                    }
                    writer.Write(row[timeStampColumn.Name], timeStampColumn.PgType);
                }
                writer.Complete();
                writer.Close();
            }

            using (var command = connection.CreateCommand())
            {
                command.CommandTimeout = 0;
        
                command.CommandText = $"SELECT * FROM {table.FullName()} WHERE {timeStampColumn.Name} >= @TIMESTAMP";
                command.Parameters.Add(new NpgsqlParameter() { ParameterName = "TIMESTAMP", Value = timeStampBeforeInsert, });

                using (var reader = command.ExecuteReader())
                {
            
                    while (reader.Read()) //Here the exception is thrown
                    {
                        var entity = new TEntity();

                        foreach (var column in table.Columns)
                        {
                            //Populate entity with values from reader
                        }

                        result.Add(entity);
                    }
                }
            }

            transactionScope.Complete();
            transactionScope.Dispose();
        }
    }
    return result;
}

I have tried the following options in the connectionstring:

CommandTimeout = 0;
InternalCommandTimeout = 0;
KeepAlive = 60 * 10;

I also have added the command.CommandTimeOut = 0; to the command which is used with ExecuteReader() to read the data back.

I have further checked the STATEMENT_TIMEOUT of the postgresql server to ensure the server itself has no timeout.

show STATEMENT_TIMEOUT => "0"

If i do something like select pg_wait(60 * 10) on the server i get no error, to the problem must be with my code. What else can i do to track the problem?


Solution

  • I have executed the query which was running when i got the exception and executed it in my php storm's database plugin. After two seconds the result was shown. But it is paginated so i clicked on "to the last page" button. Then after 30 Seconds i got the following error

    [2023-09-28 15:42:35] Error unmarshaling return header; nested exception is: 
    [2023-09-28 15:42:35] java.net.SocketException: Connection reset
    

    After that i tried it a second time an got another error

    [2023-09-28 15:44:25] Database client process needs more memory to perform the request. 
    [2023-09-28 15:44:25] To configure settings, open 'development_db@localhost' data source properties, go to the 'Advanced' tab, and add 
    [2023-09-28 15:44:25] '-XmxNNNm' to the 'VM options' field, where NNN is the number of megabytes (for example, '-Xmx2048m').
    

    After increasing the memory to 2048mb as mentioned in the message above, the error is gone.

    So the problem was the server running out of resources to handle the request. Sadly this error message didn't get raised as an exception for the NpgsqlAdapter.