
Drools Fusion Multiple Streams & Pseudoclock


I'm currently working on my master's thesis, which involves using Drools Fusion to process events coming from multiple streams of XML files (so I am 'replaying' each file as a stream). The files record a football match in which GPS sensors attached to the players monitor their acceleration, velocity and other measures such as player load.

Each XML file contains instances of events stating an ID, start time, end time and code as follows:

<file>
<SESSION_INFO>
<start_time>2015-09-17 19:02:31.31 +100</start_time>
</SESSION_INFO>
<SORT_INFO>
<sort_type>sort order</sort_type>
</SORT_INFO>
<ALL_INSTANCES>

<instance>
<ID>1</ID>
<start>0</start>
<end>1.51</end>
<code>Accel : 0.00 - 2.00</code>
</instance>

<instance>
<ID>2</ID>
<start>1.52</start>
<end>3.01</end>
<code>Accel : -2.00 - 0.00</code>
</instance>

<instance>
<ID>3</ID>
<start>3.02</start>
<end>4.01</end>
<code>Accel : 0.00 - 2.00</code>
</instance>

<instance>
<ID>4</ID>
<start>4.02</start>
<end>4.21</end>
<code>Accel : 2.00 - 4.00</code>
</instance>
</ALL_INSTANCES>
</file>

I have 9 of these files, which all need to be processed concurrently, feeding their events into the engine simultaneously. My current implementation uses a JAXB unmarshaller to feed these events into the stream, but I have no idea how to do it concurrently (i.e., feed in the first event of each stream, then the second event of each stream, etc.). I was looking into using threads for that part of the implementation, unless there is another tool in Drools that I've missed that would help with this. I have searched fairly thoroughly, but found no comprehensive examples of processing multiple streams concurrently.

Another question I have is regarding the pseudo clock. All 9 streams start at the same time, but the events within each stream happen at different times, so I cannot simply advance the clock after every insert: the events from different streams won't line up. For example, if instance 1 in the XML above lasts 1.51 seconds and an event from another stream lasts, say, 4 seconds, and I advance the clock by each event's duration in turn, the two streams end up out of sync with each other.

However, all my time-related data exists in each stream. The kick-off time is 19:02:31, and each event carries 'start' and 'end' timestamps in seconds after kick-off, so the duration of each event is (end timestamp - start timestamp). The processing I need to do involves taking these acceleration events and correlating them across streams whenever 2 or more players accelerate at the same rate over roughly the same time interval.

Can anyone give me any pointers or assistance? To summarize, I need to know a better way of concurrently inserting streams into the engine and need to know if I need the pseudoclock for my implementation/processing. I am pretty much a beginner in programming so all I want is to get the system to run.

Thanks a lot!

Stu.


Solution

  • You don't need to process the nine XML files concurrently, i.e., distributed across threads. The <instance> elements appear to be sorted by start or end time (which of the two matters may depend on what needs to be computed during an instance event), so you can process them all in their natural sequence: at each step, just determine which event comes next across the nine streams.

    This way, your issue with the pseudo clock also ceases to be a problem: once you have determined the next instance event, you can simply advance the clock to its time.

    Without knowing all the details, I think that each <instance> defines two events: the player starts moving and the player stops moving. You may have to reassess the situation at each of these two events.
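The single-threaded replay described above could be sketched roughly as follows. This is only an illustration under assumptions, not a tested Drools integration: the class and method names (StreamReplay, TimedEvent, split, merge) are made up for the example, each <instance> is assumed to have been unmarshalled already, and a plain long stands in for the pseudo clock. In a real session, the commented line is where you would call advanceTime on the SessionPseudoClock and insert the event through an entry point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class StreamReplay {

    /** A start or stop occurrence derived from one <instance> element (hypothetical type). */
    static final class TimedEvent {
        final int stream;         // index of the source file
        final int instanceId;     // <ID> of the originating instance
        final boolean isStart;    // true = instance begins, false = instance ends
        final long offsetMillis;  // milliseconds after kick-off
        final String code;        // e.g. "Accel : 0.00 - 2.00"

        TimedEvent(int stream, int instanceId, boolean isStart, long offsetMillis, String code) {
            this.stream = stream;
            this.instanceId = instanceId;
            this.isStart = isStart;
            this.offsetMillis = offsetMillis;
            this.code = code;
        }
    }

    /** Splits one instance (start/end in seconds after kick-off) into its two events. */
    static List<TimedEvent> split(int stream, int id, double start, double end, String code) {
        List<TimedEvent> out = new ArrayList<>();
        out.add(new TimedEvent(stream, id, true, Math.round(start * 1000), code));
        out.add(new TimedEvent(stream, id, false, Math.round(end * 1000), code));
        return out;
    }

    /** Read position within one stream's event list. */
    private static final class Cursor {
        final List<TimedEvent> events;
        int next = 0;
        Cursor(List<TimedEvent> events) { this.events = events; }
        TimedEvent head() { return events.get(next); }
    }

    /** Merges per-stream event lists (each already sorted by time) into one time-ordered list. */
    static List<TimedEvent> merge(List<List<TimedEvent>> streams) {
        // The queue always holds one cursor per non-exhausted stream, ordered by its next event.
        PriorityQueue<Cursor> queue = new PriorityQueue<>(
                Comparator.comparingLong((Cursor c) -> c.head().offsetMillis));
        for (List<TimedEvent> s : streams) {
            if (!s.isEmpty()) queue.add(new Cursor(s));
        }
        List<TimedEvent> ordered = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();          // stream whose next event is earliest
            ordered.add(c.head());
            c.next++;
            if (c.next < c.events.size()) queue.add(c);  // re-queue with its new head
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<TimedEvent> a = new ArrayList<>();
        a.addAll(split(0, 1, 0.0, 1.51, "Accel : 0.00 - 2.00"));
        a.addAll(split(0, 2, 1.52, 3.01, "Accel : -2.00 - 0.00"));
        List<TimedEvent> b = split(1, 1, 0.5, 4.5, "Accel : 0.00 - 2.00"); // invented second stream

        long clockMillis = 0; // stand-in for the pseudo clock, starting at kick-off
        for (TimedEvent e : merge(Arrays.asList(a, b))) {
            long delta = e.offsetMillis - clockMillis;
            // With Drools (assumed wiring): clock.advanceTime(delta, TimeUnit.MILLISECONDS);
            //                               entryPoint.insert(e);
            clockMillis += delta;
            System.out.println(clockMillis + " ms, stream " + e.stream
                    + (e.isStart ? " start " : " stop ") + e.code);
        }
    }
}
```

Because the clock is only ever advanced by the gap to the next event in the merged order, it never has to move backwards and all nine streams stay aligned to the common kick-off time. The merge relies on each file being sorted by time, which is an assumption based on the sample shown in the question.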