How can I force NEST to NOT populate request on Bulk API response?


I've been looking all over the place and haven't been able to find a suitable answer to this question. I've created a NEST client using this code:

var myIndex = "myTestIndex";
var myType = "myTestType";

var myClusterUri= "http://localhost:9200";
var uri = new Uri(myClusterUri);
var settings = new ConnectionSettings(uri);
var client = new ElasticClient(settings);

Later, I use that client to call the Bulk API:

var myJson = PopulateJsonForBulkAPI();
var rawBulkResult = client.Raw.Bulk(myIndex, myType, myJson);

The problem is that the Bulk API call throws an OutOfMemoryException. The method that populates myJson creates a HUGE block of JSON. On its own that block fits in memory, but a second copy of it does not. When I call the Bulk API, NEST holds onto the original request, which in effect duplicates the JSON, and there isn't enough memory for both copies. Is there a way to call the Bulk API but tell NEST NOT to hold onto the original request, so the huge block of JSON isn't duplicated in memory?

Edit

I'm using NEST version 1.7.2 and Elasticsearch version 1.7.2.


Solution

  • In NEST 1.x, the request bytes are always made available on the response, but you could write an HttpConnection implementation that doesn't do this by overriding DoSynchronousRequest and DoAsyncRequest.

    If you're getting OutOfMemoryExceptions though, this sounds like you're trying to send too much data in one bulk request. Consider splitting up the data into batches of bulk requests.
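
As a rough illustration of the batching suggestion, here is a minimal sketch that lazily groups documents into smaller bulk bodies. The names BulkBatcher, ToBulkBodies and batchSize are hypothetical and not part of NEST. It assumes every operation is an index action, that each document is already serialized as single-line JSON, and that the index and type are supplied by the caller on the request rather than in the action line.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of NEST or Elasticsearch.Net.
public static class BulkBatcher
{
    // Groups JSON documents into newline-delimited bulk bodies holding
    // at most batchSize documents each. Bodies are produced lazily, so
    // only one batch needs to be held in memory at a time.
    public static IEnumerable<string> ToBulkBodies(
        IEnumerable<string> jsonDocuments, int batchSize)
    {
        if (jsonDocuments == null)
            throw new ArgumentNullException("jsonDocuments");
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException("batchSize");

        var body = new StringBuilder();
        var count = 0;

        foreach (var doc in jsonDocuments)
        {
            // Assumption: index actions only, with the index and type
            // taken from the request, so the action line is empty.
            body.Append("{\"index\":{}}\n");
            body.Append(doc).Append('\n');
            count++;

            if (count == batchSize)
            {
                yield return body.ToString();
                body.Clear();
                count = 0;
            }
        }

        // Emit the final, partially filled batch, if any.
        if (count > 0)
            yield return body.ToString();
    }
}
```

Each body could then be sent with the same call used in the question, for example client.Raw.Bulk(myIndex, myType, body) inside a foreach loop. This only reduces memory pressure if the documents themselves are produced lazily instead of being built into one large string first, and a suitable batch size depends on document size and would need to be found by experiment.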