synchronization eai application-integration

What's the best way of synchronizing data between decoupled systems?


I have, let's say, two fully decoupled systems (there will be more in the future): system A and system B.

Let's say every piece of information on each system has an informationID. Nothing stops the same informationID from appearing on different systems. What uniquely identifies a piece of information across all systems is a Source-informationID pair.
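To make the Source-informationID pair concrete, here is a minimal sketch of such a composite key. The class name `GlobalKey` and its field names are illustrative assumptions, not something either system is known to define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlobalKey:
    """Identity of a record across systems: (source system, local ID)."""
    source: str          # system where the record was first created
    information_id: int  # ID the record has inside that system


# The same local ID on two systems yields two distinct global keys.
key_a = GlobalKey(source="A", information_id=42)
key_b = GlobalKey(source="B", information_id=42)

# Frozen dataclasses are hashable, so keys can index a lookup table.
records = {key_a: "record created on A", key_b: "record created on B"}
```

Because the key is a value object, two keys built from the same source and ID compare equal wherever they are constructed, which is what a receiving system needs in order to match records.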

Let's say I need to export a piece of information from system A to system B. I then want to export the same piece of information from system B, re-import it into system A, and be able to recognize that it is the same piece of information.

What's the best way of doing this in people's experience?

This is what I am thinking of doing:

  1. Set up a message bus between the systems, with message queues.
  2. Set up an endpoint for each system that monitors changes and generates commands, wrapped in messages, that are pumped into the queues (for example when a piece of information is created, updated, or deleted).
  3. Assign ranks to the endpoints for create/update/delete commands, so that filtering relies only on a general hierarchy and not on system names, and no system needs to know about the others.
  4. Assign a threshold for create/update/delete commands to each endpoint, so that commands not meeting the threshold are filtered out and not processed.
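The four steps above could be sketched roughly as follows. This is one possible reading of the rank/threshold idea, assuming that each endpoint stamps its own rank on outgoing commands and that a receiver drops any command whose rank is below its threshold for that action. `Command`, `Endpoint`, `drain`, and the rank values are all hypothetical, and an in-process `queue.Queue` stands in for a real message bus.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass(frozen=True)
class Command:
    action: str       # "create", "update" or "delete"
    sender_rank: int  # rank of the emitting endpoint for this action
    payload: dict


@dataclass
class Endpoint:
    name: str
    ranks: dict       # action -> rank stamped on outgoing commands
    thresholds: dict  # action -> minimum rank accepted for incoming commands

    def emit(self, action, payload):
        """Wrap a local change into a command carrying this endpoint's rank."""
        return Command(action, self.ranks[action], payload)

    def accepts(self, command):
        """Keep a command only if the sender's rank meets our threshold."""
        return command.sender_rank >= self.thresholds[command.action]


def drain(queue, receiver):
    """Pop every queued command and return the ones the receiver accepts."""
    accepted = []
    while not queue.empty():
        command = queue.get()
        if receiver.accepts(command):
            accepted.append(command)
    return accepted


# Hypothetical configuration: A outranks B for deletes.
endpoint_a = Endpoint("A", ranks={"create": 2, "update": 2, "delete": 3},
                      thresholds={"create": 1, "update": 1, "delete": 3})
endpoint_b = Endpoint("B", ranks={"create": 2, "update": 2, "delete": 1},
                      thresholds={"create": 1, "update": 1, "delete": 2})

queue_to_a = Queue()
queue_to_a.put(endpoint_b.emit("update", {"informationID": 7}))
queue_to_a.put(endpoint_b.emit("delete", {"informationID": 7}))

# A accepts B's update but filters out B's delete.
processed = drain(queue_to_a, endpoint_a)
```

Note that neither endpoint refers to the other by name; the only shared knowledge is the numeric hierarchy, which is the property step 3 asks for.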

This won't solve the fact that I still need to carry around originalSource+originalSourceID though.
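One common way to carry originalSource+originalSourceID around is to put the pair in every message and have each system keep a small cross-reference table from that pair to its own local ID. The sketch below assumes exactly that. The `System` class, its method names, and the message layout are illustrative only.

```python
class System:
    """Toy system that keeps a cross-reference of origin keys to local IDs."""

    def __init__(self, name):
        self.name = name
        self.records = {}    # local ID -> data
        self.origin_of = {}  # local ID -> (original source, original ID)
        self.local_of = {}   # (original source, original ID) -> local ID
        self._next_id = 1

    def _register(self, origin=None):
        """Allocate a local ID; records created here are their own origin."""
        local_id = self._next_id
        self._next_id += 1
        origin = origin or (self.name, local_id)
        self.origin_of[local_id] = origin
        self.local_of[origin] = local_id
        return local_id

    def create(self, data):
        local_id = self._register()
        self.records[local_id] = data
        return local_id

    def export(self, local_id):
        """The message always carries the original key, never the local one."""
        return {"origin": self.origin_of[local_id],
                "data": self.records[local_id]}

    def import_(self, message):
        """Update the matching record if the origin is known, else create."""
        origin = tuple(message["origin"])
        local_id = self.local_of.get(origin)
        if local_id is None:
            local_id = self._register(origin)
        self.records[local_id] = message["data"]
        return local_id


system_a, system_b = System("A"), System("B")
system_b.create("unrelated record on B")         # B already uses local ID 1

id_on_a = system_a.create("address v1")          # origin is ("A", 1)
id_on_b = system_b.import_(system_a.export(id_on_a))

system_b.records[id_on_b] = "address v2"         # edited on B
id_back_on_a = system_a.import_(system_b.export(id_on_b))
```

On the round trip, system A sees its own name in the origin key and updates the existing record instead of creating a duplicate, even though system B stored the record under a different local ID.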

Any help appreciated.


Solution

  • This problem has been addressed by EAI (Enterprise Application Integration) vendors such as Tibco and webMethods (now part of Software AG). I've never used Tibco, but I've used webMethods to solve this kind of problem, so I'll focus on webMethods. For example, in an enterprise, employee data could reside in both Active Directory and PeopleSoft. webMethods could be used to ensure that changes, additions, and deletions in one system (application) are reflected in the other in real time. In another organization, employee data could also live in an Oracle or SQL Server database. Again, not a problem: EAI tools like webMethods can talk to a wide variety of back ends. webMethods is not limited to a single source and a single target; because it has a publish-subscribe architecture, data from a single source can flow to multiple interested targets that subscribe to a particular piece of information. Guaranteed delivery and many other features can be found in these products. Back to the employee example: if it is done right, all systems and applications in the enterprise can hold the same employee information at any given time, without discrepancies.

    So instead of programming in C# or Java, you'll be doing webMethods programming, which is very much like a 4GL. I call it programming because there is still logic involved (loops, if-then-else, branches, variables, packages, etc.), but it's very procedure-oriented, i.e. there is no concept of OOP at all.

    These EAI tools are built with limited purposes in mind and one of the purposes is to synchronize data between disparate systems in an enterprise easily. And they do their job very well.

    The drawback is that these tools cost a lot of money. Companies usually have a long-term strategy in place before investing in them.
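For readers unfamiliar with the publish-subscribe architecture mentioned in this answer, the following is a minimal in-memory sketch of the idea in plain Python. It is not the webMethods API. The `Broker` class, the topic name, and the two dictionaries standing in for Active Directory and PeopleSoft are assumptions for illustration, and it has none of the guaranteed-delivery features of a real EAI product.

```python
from collections import defaultdict


class Broker:
    """Tiny in-memory publish-subscribe hub."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


# Stand-ins for two target systems that hold employee data.
active_directory = {}
peoplesoft = {}


def update_active_directory(employee):
    active_directory[employee["id"]] = employee["name"]


def update_peoplesoft(employee):
    peoplesoft[employee["id"]] = employee["name"]


broker = Broker()
broker.subscribe("employee.changed", update_active_directory)
broker.subscribe("employee.changed", update_peoplesoft)

# One change published by a source reaches every subscribed target.
delivered = broker.publish("employee.changed", {"id": 1001, "name": "J. Doe"})
```

The publisher never names its targets; adding a third system is a matter of one more `subscribe` call, which is why this style suits a landscape where the number of systems is expected to grow.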