I am facing problem while reading data from a Postgres database and inserting into Elastic search using Nest library in an ASP.NET Core console application.
My approach:
It's working for 9 or 10 iterations, but sometime failed in ES.
Also this process is taking a long time, I am talking about only Elastic search insert time.
Errors:
The request was canceled due to the configured HttpClient.Timeout of 60 seconds elapsing
TimeoutException: The operation was canceled.
IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
SocketException: The I/O operation has been aborted because of either a thread exit or an application request.
Could anyone help me to resolve this problem and tell me how we can insert 10+ million records into Elastic Search successfully?
I am unsure if the issue might be with anything else in your solution but this is what works for me:
var chunks = item.Items.Chunk(5000);
foreach (var chunk in chunks)
{
await nestClient.BulkAsync(x => x.Index(item.Index).UpdateMany(chunk, (y, z) => y.IdFrom(z, true).Doc(z).Id(z.Id).RetriesOnConflict(2)));
}
You might want to check the reqiest timeout on you client, mine:
ConnectionSettings settings = new ConnectionSettings(pool)<other settings>.RequestTimeout(TimeSpan.FromMinutes(3))