
Executing a huge number of date-faceted queries in Solr in a distributed manner


One of the projects I work on requires executing a large number of date-faceted queries on Solr as fast as possible.

Can you please suggest suitable methods for this?

I have been exploring the spark-solr library as a way to send multiple parallel queries to Solr via Spark, but I am not sure that is the best approach.


Solution

  • Each Solr search runs in its own thread, so issuing concurrent requests is an established way to increase throughput. Date faceting (or really range faceting) in Solr relies on the filterCache for performance, so make sure that cache is large enough; in your case it should hold a bit more than the number of buckets in your facet setup. You can inspect the cache state through the Solr admin interface to confirm that the number of evictions stays low.
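
As a rough illustration of issuing concurrent requests, here is a minimal Python sketch that builds range-facet queries and sends them from a thread pool. The host, the collection name `mycollection`, the field `created_at`, and the pool size of 8 are all assumptions made for the example, not values from the question; the `facet.range*` parameters are the standard Solr range-faceting request parameters.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: adjust host, port and collection to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def build_range_facet_url(base_url, field, start, end, gap, query="*:*"):
    """Build a select URL that asks only for range-facet counts (rows=0)."""
    params = {
        "q": query,
        "rows": 0,          # no documents needed, only the facet counts
        "wt": "json",
        "facet": "true",
        "facet.range": field,
        "facet.range.start": start,
        "facet.range.end": end,
        "facet.range.gap": gap,
    }
    return base_url + "?" + urlencode(params)


def fetch_json(url):
    """Send one request to Solr and decode the JSON response."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def run_concurrently(urls, fetch=fetch_json, max_workers=8):
    """Run the requests on a thread pool; results keep the order of `urls`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def demo():
    """Example usage (needs a running Solr; field name is an assumption)."""
    urls = [
        build_range_facet_url(
            SOLR_SELECT_URL,
            "created_at",
            f"{year}-01-01T00:00:00Z",
            f"{year + 1}-01-01T00:00:00Z",
            "+1MONTH",
        )
        for year in range(2015, 2020)
    ]
    for result in run_concurrently(urls):
        print(result["facet_counts"]["facet_ranges"]["created_at"]["counts"])
```

Calling `demo()` against a live Solr instance sends five yearly queries with monthly buckets in parallel. The right `max_workers` value depends on how many concurrent searches your Solr nodes can sustain, so it is worth increasing it gradually while watching the filterCache eviction count mentioned above.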