Sorry if the title is a little strange - I wasn't sure how to condense my problem into a one-liner appropriately!
Basically, I have a queue of messages on System A, all of which are received through a socket from several instances of System B and processed one by one. Some of these messages modify data in System A's database, which represents the 'global state' (i.e. the state of System A and all of the System B instances).
At the same time, the instances of System B can send 'state request' messages to the queue, which, when processed, return data from System A's database to the requesting System B for further processing. Some operations on one instance of System B depend on the state of one or more other instances of System B.
Obviously, there's a data integrity issue here. As soon as the 'state request' message is processed and the data is returned, there could be any number of unprocessed messages in the queue that modify the global state, rendering the returned data unreliable.
After a lot of thinking, I'm pretty sure that this problem cannot be solved while the global architecture remains the way it is. Is there any way I can restructure the overall system such that this is no longer a problem?
Thanks!
One general approach is to create a single data path through the system, so that there is a fixed upstream-to-downstream data flow, and downstream state can lag upstream, but there's no opportunity for indeterminate ordering (race conditions).
Towards this end, is it possible to rearchitect so that:
1) A pushes (broadcasts) state changes to each B, rather than each B polling A for state. As you identified, polling introduces a second path for data and allows race conditions.
2) The "some ops on one B instance depend on other Bs" part also sounds like request-reply communication between Bs, which introduces alternate data paths and nondeterminism. The Bs are peers, so there's no obvious upstream among them. But could the data be striped across the Bs such that, for any piece of data, one B is the master and pushes updates concerning that datum to the other Bs? For example, B1 is the master for a-m, and B2 for n-z. So for a piece of data "q", the data flow is always A -> B2 -> B1 and therefore deterministic, with A as the system of record for the state of the whole system.
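To make the two points above concrete, here is a minimal in-process sketch, not a real networked implementation. The class names `A` and `B`, the `owner_for` helper, and the first-letter striping table are all made up for illustration; in your system the method calls would be socket messages.

```python
def owner_for(key, owners):
    """Return the name of the B that masters `key` (striped by first letter)."""
    first = key[0].lower()
    for (low, high), name in owners.items():
        if low <= first <= high:
            return name
    raise KeyError(f"no master configured for {key!r}")


class B:
    """Hypothetical System B instance: never polls, only receives pushes."""

    def __init__(self, name):
        self.name = name
        self.peers = []   # the other B instances
        self.state = {}   # local copy of the global state

    def receive_from_a(self, key, value):
        # Only called on the master for `key`: apply, then push downstream.
        self.state[key] = value
        for peer in self.peers:
            peer.receive_from_master(key, value)

    def receive_from_master(self, key, value):
        # Non-master Bs only ever learn about `key` from its master.
        self.state[key] = value


class A:
    """Hypothetical System A: system of record, pushes instead of answering polls."""

    def __init__(self, owners, instances):
        self.owners = owners
        self.instances = instances
        self.db = {}

    def process(self, key, value):
        self.db[key] = value                                # update the record
        master = self.instances[owner_for(key, self.owners)]
        master.receive_from_a(key, value)                   # single path: A -> master -> peers


owners = {("a", "m"): "B1", ("n", "z"): "B2"}
b1, b2 = B("B1"), B("B2")
b1.peers, b2.peers = [b2], [b1]
a = A(owners, {"B1": b1, "B2": b2})

a.process("q", 42)   # flows A -> B2 -> B1
```

Every update to a given key takes the same route, so a B's copy may lag behind A but cannot see updates to that key out of order.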
If the Bs are coupled (for example, if the a-m data depends on the n-z data), sequence numbers assigned by A to incoming messages can help distinguish newer states from older ones and prevent old data from overwriting newer data. The specifics depend on how the Bs interact with each other.
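A rough sketch of the sequence-number idea, assuming A stamps every incoming message with a monotonically increasing number and each B keeps the last number it applied per key. The `Sequencer` and `Replica` names and the dict-based message format are invented for this example.

```python
import itertools


class Sequencer:
    """Stands in for System A: stamps each message with the next sequence number."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, key, value):
        return {"seq": next(self._counter), "key": key, "value": value}


class Replica:
    """Stands in for a System B: ignores updates older than what it already holds."""

    def __init__(self):
        self.state = {}  # key -> (seq, value)

    def apply(self, update):
        current = self.state.get(update["key"])
        if current is not None and current[0] >= update["seq"]:
            return False  # stale or duplicate update, keep the newer state
        self.state[update["key"]] = (update["seq"], update["value"])
        return True


sequencer = Sequencer()
first = sequencer.stamp("q", "old")
second = sequencer.stamp("q", "new")

replica = Replica()
applied_new = replica.apply(second)  # the newer update arrives first
applied_old = replica.apply(first)   # the older one arrives late and is dropped
```

This only protects against an old value for a key overwriting a newer one. If an operation on one B needs a consistent view across several keys, it would also need to compare the sequence numbers of everything it reads.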
Any eurekas yet?