My application needs to insert documents into Solr regularly. There are two considerations:
I found that ConcurrentUpdateSolrClient is a candidate solution: it is thread-safe, and it has a queue that buffers documents and flushes many of them over one connection. But I was confused when I tested it. My question is,
The SolrClient is thread-safe, and you can share a SolrClient instance across multiple threads if your inserts/updates/deletes stick to one collection or core in the Solr instance.
But Solr does not have transactions like the ones you would find in a classic RDBMS.
You must be aware that if you have several SolrClient instances (in the same app or in different apps and servers) that concurrently update a collection/core, the first client that sends a commit to that collection/core commits all the updates made up to that moment by every client.
On the other hand, if a SolrClient instance sends a rollback, it rolls back all the uncommitted updates (even those made by the other SolrClient instances).
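The shared commit and rollback behaviour described above can be illustrated with a toy in-memory model. This is only a sketch: ToyCore and ToyClient are hypothetical stand-ins written for this example, not SolrJ classes, and they model nothing beyond "pending updates belong to the core, not to the client".

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for one Solr core: pending updates are shared by every client. */
class ToyCore {
    private final List<String> pending = new ArrayList<>();   // uncommitted updates from all clients
    private final List<String> committed = new ArrayList<>(); // updates visible to searches

    synchronized void add(String docId) { pending.add(docId); }

    // A commit from any client makes every pending update visible.
    synchronized void commit() { committed.addAll(pending); pending.clear(); }

    // A rollback from any client discards every pending update.
    synchronized void rollback() { pending.clear(); }

    synchronized List<String> visible() { return new ArrayList<>(committed); }
}

/** Toy stand-in for a client (hypothetical); it holds no update state of its own. */
class ToyClient {
    private final ToyCore core;
    ToyClient(ToyCore core) { this.core = core; }
    void add(String docId) { core.add(docId); }
    void commit() { core.commit(); }
    void rollback() { core.rollback(); }
}

class SharedCommitDemo {
    static List<String> run() {
        ToyCore core = new ToyCore();
        ToyClient a = new ToyClient(core);
        ToyClient b = new ToyClient(core);
        a.add("a1");
        b.add("b1");
        b.commit();   // also commits a1, which client a never committed itself
        a.add("a2");
        b.add("b2");
        a.rollback(); // also discards b2, which client b added
        return core.visible();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [a1, b1]
    }
}
```

In this model neither client can protect its own updates from the other client's commit or rollback, which is the practical meaning of "no transactions" here.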
There are many strategies for updating documents concurrently in Solr, and to understand how commits work in Solr I warmly recommend reading
And if you're writing your own multithreaded application, I recommend centralising the commits and rollbacks in one place.
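One possible way to centralise commits is to let worker threads only add documents, while the coordinating thread issues the single commit after all workers have finished. The sketch below assumes a hypothetical UpdateTarget interface standing in for the shared client; it is not a SolrJ type, and CountingTarget only counts calls so the behaviour can be observed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the shared client; not a SolrJ type. */
interface UpdateTarget {
    void add(String docId);
    void commit();
}

/** In-memory implementation that counts calls so the demo is observable. */
class CountingTarget implements UpdateTarget {
    final AtomicInteger adds = new AtomicInteger();
    final AtomicInteger commits = new AtomicInteger();
    public void add(String docId) { adds.incrementAndGet(); }
    public void commit() { commits.incrementAndGet(); }
}

class CentralCommitDemo {
    /** Workers only add; the coordinating thread commits once at the end. */
    static void indexAll(UpdateTarget target, int workers, int docsPerWorker) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> tasks = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int id = w;
                tasks.add(pool.submit(() -> {
                    for (int d = 0; d < docsPerWorker; d++) {
                        target.add("w" + id + "-d" + d); // workers never commit
                    }
                }));
            }
            for (Future<?> task : tasks) {
                task.get(); // wait for each worker and surface any failure
            }
            target.commit(); // the single place where a commit is issued
        } catch (InterruptedException | ExecutionException e) {
            // No commit is issued on failure; a rollback could be centralised here too.
            throw new IllegalStateException("indexing failed, nothing committed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        CountingTarget target = new CountingTarget();
        indexAll(target, 4, 25);
        System.out.println(target.adds.get() + " adds, " + target.commits.get() + " commit(s)");
    }
}
```

Because only one code path ever commits, no worker can accidentally make another worker's half-finished batch visible.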
ConcurrentUpdateSolrClient buffers all added documents and writes them into open HTTP connections. This class is thread safe. Although any SolrClient request can be made with this implementation, it is only recommended to use ConcurrentUpdateSolrClient with /update requests. The class HttpSolrClient is better suited for the query interface.