I have Python code running on multiple analyzer machines, each picking documents from Solr (select operations) and modifying data in Solr by re-submitting the documents with updated fields from the DB (in case of update/insert). But since the Solr instances on the different machines each hold their own updated documents, this leads to data inconsistency across the machines.
Is there any way I can keep a central Solr document repository that is queried and updated by the different machines, thereby ensuring data consistency?
The Solr forums have multiple threads on concurrent Solr adds/updates, which should give you a clear picture.
You can maintain a single instance of Solr and have multiple clients commit into it.
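As a rough sketch of that setup, every analyzer machine could point at the same Solr base URL for both its select and its update requests. The host name `solr-central.example.com`, the core name `documents`, and the helper names below are assumptions for illustration; only the `/select` and `/update` handlers and the `commit=true` parameter are standard Solr.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical central Solr core; replace with your own host and core name.
SOLR_BASE = "http://solr-central.example.com:8983/solr/documents"


def build_select_url(query, rows=10, base=SOLR_BASE):
    """Build the URL for a select (read) request against the central core."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}/select?{params}"


def build_update_request(docs, commit=False, base=SOLR_BASE):
    """Build a POST request that re-submits the given documents as JSON."""
    url = f"{base}/update"
    if commit:
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request_or_url, timeout=10):
    """Send a request to Solr and return the decoded JSON response."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Each analyzer would then call something like `send(build_select_url("id:42"))` to read and `send(build_update_request([doc], commit=True))` to write, so all machines see the same index.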
Solr is not transactional like an RDBMS, but it does handle concurrency.
Whenever a commit is made, a lock is held so that other clients can't commit at the same time; their commits are queued.
A commit also covers all pending updates, including those sent by other clients.
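Because a commit covers everything pending, one option is for the clients to send their adds without committing and leave the commit to Solr or to a single client. The sketch below assumes the same hypothetical host and core as before and uses hypothetical helper names; `commitWithin` and the JSON `{"commit": {}}` command are standard Solr update features.

```python
import json
import urllib.request

# Hypothetical central Solr core; replace with your own host and core name.
UPDATE_URL = "http://solr-central.example.com:8983/solr/documents/update"


def add_request(docs, commit_within_ms=None):
    """Build an add request that does not commit immediately.

    If commit_within_ms is given, Solr is asked to commit on its own
    within that many milliseconds (the commitWithin parameter).
    """
    url = UPDATE_URL
    if commit_within_ms is not None:
        url += f"?commitWithin={int(commit_within_ms)}"
    return urllib.request.Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def commit_request():
    """Build an explicit commit command for the whole index."""
    return urllib.request.Request(
        UPDATE_URL,
        data=json.dumps({"commit": {}}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing either request to `urllib.request.urlopen` sends it. Fewer explicit commits means less queuing on the commit lock, at the cost of updates becoming visible slightly later.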