Question: How can I get "guaranteed commits" with Apache Solr, where persisting data to disk and making it visible to searches are equally important?
Background: We have a website that requires high-end search functionality for machine learning and also requires guaranteed commits for financial transactions. We want to use Solr as our only datastore to keep things simple, and do not want to run another database on the side.
I can't seem to find any answer to this question. The simplest solution for a financial transaction seems to be to periodically query Solr for the record after it has been persisted, but this can mean a longer wait time. Is there a better solution?
Can anyone please suggest a solution for achieving "guaranteed commits" with Solr?
As you were told on the mailing list, Solr does not have transactions. If you index from a dozen clients, and a commit happens from somewhere (either autoSoftCommit, commitWithin on the update request, or an explicit commit from one of those dozen clients), all of the documents indexed by those dozen clients will be visible to all searchers.
With a transactional database, each of the dozen clients that is sending updates would have to issue a commit, which would only make the changes made by that specific client visible.
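To make the commit triggers mentioned above concrete, here is a minimal Python sketch of how commitWithin and an explicit commit appear as parameters on an update request (autoSoftCommit is set in solrconfig.xml rather than per request). The host, port, collection name "transactions", and the helper name update_url are assumptions for illustration only. Whichever form is used, the resulting commit covers every client's pending documents, not just the sender's.

```python
from urllib.parse import urlencode

# Assumed base URL and collection name; adjust for your deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/transactions/update"


def update_url(commit=False, soft_commit=False, commit_within_ms=None):
    """Build an update URL carrying the requested commit parameters.

    Hypothetical helper: it only assembles the query string and does
    not contact Solr.
    """
    params = {}
    if commit:
        params["commit"] = "true"        # explicit hard commit
    if soft_commit:
        params["softCommit"] = "true"    # explicit soft commit
    if commit_within_ms is not None:
        # Ask Solr to commit within this many milliseconds.
        params["commitWithin"] = str(commit_within_ms)
    query = urlencode(params)
    return SOLR_UPDATE_URL + ("?" + query if query else "")


if __name__ == "__main__":
    print(update_url(commit=True))
    print(update_url(commit_within_ms=5000))
```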
Solr usually does not make any guarantees regarding commits. If you issue ten commits in parallel, that will most likely exceed the maxWarmingSearchers configuration, which is typically set to 2. Most of those ten commits wouldn't actually create a new searcher, which is what makes new documents visible.
If you issue manual commits in such a way that you never exceed maxWarmingSearchers, then a commit that finishes without error can be taken as a sign that all changes indexed before it are now visible.
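One way to keep explicit commits from overlapping is to route them all through a single lock. The sketch below is only an illustration under several assumptions: the URL and collection name are made up, index_and_commit and http_post are hypothetical helpers, and the lock only serializes commits inside one Python process. It does nothing about commits triggered by other processes, autoSoftCommit, or commitWithin, so those would have to be disabled or coordinated separately.

```python
import json
import threading
import urllib.request

# Assumed URL and collection name; adjust for your deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/transactions/update"

# Serializes commits issued from this process only.
_commit_lock = threading.Lock()


def http_post(url, body):
    """POST a JSON body and return (HTTP status, raw response bytes)."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read()


def index_and_commit(docs, post=http_post):
    """Index docs with an explicit commit, one commit at a time.

    Returns True only if Solr reports success; raises otherwise.
    The post argument is injectable so the logic can be exercised
    without a running Solr instance.
    """
    body = json.dumps(docs).encode("utf-8")
    # waitSearcher=true asks Solr to block until the new searcher is
    # registered, so a successful return implies visibility.
    url = SOLR_UPDATE_URL + "?commit=true&waitSearcher=true&wt=json"
    with _commit_lock:
        status, payload = post(url, body)
    if status != 200:
        raise RuntimeError("Solr returned HTTP %d" % status)
    header = json.loads(payload).get("responseHeader", {})
    if header.get("status") != 0:
        raise RuntimeError("Solr reported an error: %r" % header)
    return True
```

A caller would treat a True return as "persisted and visible" and treat any exception as "unknown", retrying or alerting rather than assuming the record was stored.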