
solrnet - solr.Add(doc) bottleneck


Win 7/SolrNet 0.4.0/C# winforms .net 4.0 client.

I am using SolrNet in a threaded WinForms application to write multiple bitmaps and some mathematical descriptors to a Solr instance (on a different server). The interesting thing is that the solr.Add method seems to slow down the app significantly: if I comment out the Add and Commit calls, CPU utilization jumps to 90% or so, but with them in place it is only about 20%. The docs do appear to be written to Solr, however.

Is that expected behavior? Would the Solr writes be the bottleneck? How can I get around that?

            var doc = new IndexDocument
            {
                _UUID = Guid.NewGuid().ToString(),
                _FileName = FileName,
            };

            // Bitmap is not thread-safe, so we make a separate copy for each Task; the copies are made synchronously.
            Bitmap[] blobCopies = MakeBlobCopies(bmpBlob, 2);

            Task<List<KeyValuePair<string, double>>>[] descriptorTasks = new Task<List<KeyValuePair<string, double>>>[2];
            descriptorTasks[0] = Task.Factory.StartNew<List<KeyValuePair<string, double>>>(() => ApplyDescriptor1(blobCopies[0]));
            descriptorTasks[1] = Task.Factory.StartNew<List<KeyValuePair<string, double>>>(() => ApplyDescriptor2(blobCopies[1]));

            Task.WaitAll(descriptorTasks);
            foreach (var t in descriptorTasks)
            {
                List<KeyValuePair<string, double>> flds = t.Result;
                foreach (KeyValuePair<string, double> fld in flds)
                {
                    if (!String.IsNullOrEmpty(fld.Key))
                    {
                        SetPropertyValue(doc, fld.Key, fld.Value);
                    }
                }
            }

            DisposeBlobCopies(blobCopies);

            solr.Add(doc);
            solr.Commit();

Solution

  • The Solr calls could well be the bottleneck in your app, since they are blocking I/O calls: while a thread waits on the network it does no CPU work, which would explain the drop in utilization. Add and Commit are two distinct calls to a RESTful API. I'm almost positive that SolrNet does not have native support for async calls in its API, but it looks like you know how to use the Task Parallel Library. Make the calls non-blocking and things should speed up. You may find it necessary to control the amount of concurrency.
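
One way to make the calls non-blocking with the Task Parallel Library is sketched below; treat it as a rough illustration under stated assumptions rather than a definitive implementation. The `BackgroundIndexer` class and its parameters are hypothetical names, not part of SolrNet. Worker threads hand finished documents to a bounded queue, and a single background task drains it and makes the blocking calls. The bounded capacity is what controls the amount of concurrency. Committing every N documents instead of after every one is an extra assumption that the answer above does not state. Error handling is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper (not a SolrNet type): moves blocking writes onto
// one background task so the worker threads can keep computing.
public sealed class BackgroundIndexer<T> : IDisposable
{
    private readonly BlockingCollection<T> queue;
    private readonly Task writer;

    // add and commit are supplied by the caller, e.g. thin wrappers
    // around the solr.Add(doc) and solr.Commit() calls in the question.
    public BackgroundIndexer(Action<T> add, Action commit, int capacity, int commitEvery)
    {
        // Bounded queue: Enqueue blocks once 'capacity' documents are
        // waiting, which caps memory use and throttles the producers.
        queue = new BlockingCollection<T>(capacity);

        writer = Task.Factory.StartNew(() =>
        {
            int pending = 0;
            // Blocks until a document is available; the loop ends after
            // CompleteAdding() has been called and the queue is empty.
            foreach (T doc in queue.GetConsumingEnumerable())
            {
                add(doc);
                if (++pending >= commitEvery)
                {
                    commit();
                    pending = 0;
                }
            }
            // Commit whatever is left over from the last partial batch.
            if (pending > 0) commit();
        }, TaskCreationOptions.LongRunning);
    }

    // Called from the worker threads in place of the blocking Solr calls.
    public void Enqueue(T doc)
    {
        queue.Add(doc);
    }

    // Signals that no more documents are coming and waits for the
    // background task to finish writing everything already queued.
    public void Dispose()
    {
        queue.CompleteAdding();
        writer.Wait();
        queue.Dispose();
    }
}
```

In the question's code, the indexer would be created once, for example `new BackgroundIndexer<IndexDocument>(d => solr.Add(d), () => solr.Commit(), 100, 50)`, where the capacity and batch size are arbitrary illustrative values. The last two lines of the snippet would then become `indexer.Enqueue(doc);`, with `Dispose()` called when the application has finished producing documents.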