Search code examples
pythonsolrdata-consistency

Solr consistency in python code


I have got python code running on multiple analyzer machines each picking documents from solr(select operations) and modifying data in solr by re-submitting the documents with the updated fields from DB(in case of update/Insert). But since different solr instances on different machines have their own updated documents,this is leading to data inconsistency across the machines.

Is there any way i can keep a central solr document repository which will be queried and updated by different machines,thereby ensuring data consistency?


Solution

  • Solr forums would provide multiple threads on Concurrent Solr add/updates which would give you a clear picture.

    You can maintain a single instance of Solr and have multiple clients commit into it.
    Solr is not transactional like an RDBMS, but it does handle concurrency.
    Whenever is commit is made a a lock is maintained so that others can't commit and are queued.
    A commit can commit all pending commits as well.