matching elements from one set to another using given metric

First, I'd like to say I'm not looking for code, I'm looking for an algorithm.

Motivation:

I'm writing top-level testing of a complex real-time software system. It runs all software components (~20 processes, ~100 threads), sets up fake data sources (rtsp video sources) and feeds prepared data (video files) to the system, records system responses (events) and then stops the system after all prepared test data was sent.

As the test data is always the same, I expect tested system to provide correct responses (events) at correct times (from the start of the test).

I then compare generated responses (events) to expected events (manually prepared), which I expect to be all there, possibly with some small time variances which I would limit with some given time-tolerance, let's say 5 seconds.

Let's say the tested system is supposed to detect animals in 1500 seconds long video and I watched it and written down 5 animals and time when they appeared in video:

at   10s - a sparrow
at   20s - a cat
at   50s - a rabbit
at  100s - an owl
at 1000s - a bear

Based on that, I would then write expected_events set:

expected_events = [
    Event(10, 'sparrow'),
    Event(20, 'cat'),
    Event(50, 'rabbit'),
    Event(100, 'owl')
    Event(1000, 'bear')
]

And I want to be able to tell how well the real detected events (which will be affected by processor load, disk usage, network usage atd, as this is multiprocess system on real computer) match these expected_eevents.

Let's say the tested system returned:

detected_events = [
    Event(10.1, 'sparrow'),
    Event(19.5, 'cat'),
    Event(50.2, 'rabbit'),
    Event(99.3, 'owl')
    Event(1000.2, 'bear')
]

Which I would evaluate as correct, 100% match with expected events, all events are present and time differences are below time-tolerance:

matches = [
    {'name': 'sparrow', 'detected': 10.1,   'expected': 10,   'time-diff': 0.1},
    {'name': 'cat',     'detected': 19.5,   'expected': 20,   'time-diff': 0.5},
    {'name': 'rabbit',  'detected': 50.2,   'expected': 50,   'time-diff': 0.2},
    {'name': 'owl',     'detected': 99.3,   'expected': 100,  'time-diff': 0.7},
    {'name': 'bear',    'detected': 1000.2, 'expected': 1000, 'time-diff': 0.2},
]

If the tested system returned:

detected_events = [
    Event(10.1, 'sparrow'),
    Event(50.2, 'rabbit'),
    Event(99.3, 'owl')
    Event(1010.5, 'bear')
]

which I would expect could result in matches like this:

raw_matches = [
    {'name': 'sparrow', 'detected': 10.1,   'expected': 10,   'time-diff': 0.1},
    {'name': 'cat',     'detected': None,   'expected': 20,   'time-diff': None},
    {'name': 'rabbit',  'detected': 50.2,   'expected': 50,   'time-diff': 0.2},
    {'name': 'owl',     'detected': 99.3,   'expected': 100,  'time-diff': 0.7},
    {'name': 'bear',    'detected': 1010.5, 'expected': 1000, 'time-diff': 10.52},
]

pruned_matches = [
    {'name': 'sparrow', 'detected': 10.1,   'expected': 10,   'time-diff': 0.1},
    {'name': 'rabbit',  'detected': 50.2,   'expected': 50,   'time-diff': 0.2},
    {'name': 'owl',     'detected': 99.3,   'expected': 100,  'time-diff': 0.7},
]

I would see this as fail, because:

it didn't detected a cat
bear was detected 10.5 seconds too late
therefore only 3 out of 5 was really matched and result should be 60% match

So I what I need is a method for evaluating detected_events against expected_events to be able to evaluate how good the tested system works.

Simplyfication

Because matching event types is essential to the problem and can be done as matching for every event type separately, I'll do following simplyfications:

all events are the same - i.e. only the time of the event is important and therefore event will be represented only by the timestamp
timestamp will be int to make it easier to read

What do I consider to be a "good" match:

As many of you pointed out in comments, I actually don't have metric for evaluating final matching beyond dismissing matches with time difference > time-tolerance. This makes it a bit harder, but I think it's intuitive - I know what should happen at which time, and I compare that to actual events and I'll try to match them as good as is possible to make sure:

as many expected events are matched as possible
each detected_event matching expected_event must happen at the same time with given time tolerance.

So this I would consider "correct" matches (with 5 second time tolerance):

                                  matches = [
expected_events = [10, 20]  =>        {'expected': 10,    'detected': 10},
detected_events = [10, 20]            {'expected': 20,    'detected': 20},
                                  ]

                                  matches = [
expected_events = [10, 20]  =>        {'expected': 10,    'detected': 15},
detected_events = [15, 25]            {'expected': 20,    'detected': 25},
                                  ]

                                  matches = [
expected_events = [10, 20]  =>        {'expected': 10,    'detected':  5},
detected_events = [ 5, 15]            {'expected': 20,    'detected': 15},
                                  ]

                                  matches = [
expected_events = [10, 11]  =>        {'expected': 10,    'detected': 11},
detected_events = [11, 12]            {'expected': 11,    'detected': 12},
                                  ]


                                  matches = [
expected_events = [10, 20]  =>        {'expected': 10,    'detected': 10},
detected_events = [10, 26]        ]

expected_events = [10, 20]  =>    matches = []
detected_events = [ 4, 26]

                                  matches = [
expected_events = [10, 20, 30] =>     {'expected': 20,    'detected': 17},
detected_events = [17, 24]        ]

This I would consider "bad" matches (i.e. this is not how I want it to work):

                                  matches = [
expected_events = [10, 20]  =>        {'expected': 20,    'detected': 15},
detected_events = [15, 25]        ]
# matched only 1 events even though it's possible to match 2

                                  matches = [
expected_events = [10, 11]  =>        {'expected': 11,    'detected': 11},
detected_events = [11, 12]        ]
# matched only 1 events even though it's possible to match 2


                                  matches = [
expected_events = [10, 20]  =>        {'expected': 10,    'detected': 4},
detected_events = [ 4, 26]            {'expected': 20,    'detected': 26},
                                  ]
# should not match anything, time differences > 5sec

Pseudocode / What I tried:

Code should look something like this:

expected_events = [10, 20, 50, 100, 1000] # times in second
detected_events = [11, 18, 51,      1001]

def metric(ev1, ev2):
    return abs(ev1 - ev2)

matches = match_events(expected_events, detected_events, metric, tolerance=5)

Simple, naive approach - start from best match

I tried extremely simple approach:
- product of (expected_events, detected_events)
- compute time difference
- filter matches with time difference greater than given tolerance
- sort matches by time difference
- start matching from first matches and discard "conflicts" (i.e. latter matches using same elements)
This works for simple cases, but I run into a problem when events are "shifted":
```
 expected_events = [10, 11]
 detected_events = [11, 12] 
```
Now I get:
```
 [
     {'expected': 11,    'detected': 11},
 ]
```
While I want to have:
```
 [
     {'expected': 10,    'detected': 11},
     {'expected': 11,    'detected': 12},
 ]
```
Brute force - permutations

Then I tried brute force:
- product of (expected_events, detected_events)
- compute time difference
- filter matches with time difference greater than given tolerance
- create all possible permutations of given matches
- for each permutation, start matching from first matches and discard "conflicts" (i.e. latter matches using same elements)
- compute lengths of all those matchings
- keep only those with maximal length
- select matching with min(sort by sum of all time differences)
This works, but as you may have expected, it's too slow for anything longer than few elements. (I hoped to make it work using lazy evaluation, but couldn't make it work).

Question:

What would be a correct algorithm for such matching? It seems to me this may be already solved problem - but I don't know what to search for.

It is a solved problem - assignment problem - https://en.wikipedia.org/wiki/Assignment_problem - thanks to Yossi Levi!

Solution

I will follow your asking, and provide a non-programming approach, but only logical (semi-mathematical) approach as I see it.

Algorithm steps:

lets define the threshold as T
pad the smaller list with don't care values (None for example) - just to make sure the dimension agreed
create similarity matrix by take absolute value of decreasing each element of one list to element in other list, let's define the matrix as S.

clarification: S[i,j] refer to the difference between i'th element from list A to j'th element in list B

create binary matrix B in which every element that satisfy the threshold critirea is 1, otherwise it 0 (MATLAB-> B=S<L)

for example:

       0 0 0
    
B =    0 1 1
    
       1 0 0

saying X dimension represent list A and Y represent list B then:

B[2] === A[2]
B[2] === A[3]
B[3] === A[1] (=== means that two elements satisfy the criteria of similarity)

Now - the question becomes more mathematical, in order to find the best match, you can use one of the following suggestions:

brute force (less elegant approach in my opinion):

choose element which is 1 (among non marked rows and columns)
mark his entire column and row as marked
keep choose another 1 until there aren't legal places and sum it up to score
iteratively do it for all options and choose the one with the highest score

more elegant approach:

iterate over all permutations of columns (for B matrix)
if Trace or opposite-diagonal equals to len(A) then finish (found matching for all elements)
choose the permutation with highest TRACE (or the opposite diagonal) number of permutations in worst case are of-course N! (which N is the dimension of B)

Best approach:

While searching over the net on how to find the maximum Trace of matrix when only columns swapping allowed I ran into this (pay attention please to Example and Matrix interpretation sections). moreover, I found implementation of it in PyPI package.

As the documentation says - the algorithm is trying to find the minimum Trace over all possible permutations of matrix - so just invole with -B as argument instead.

overall example (with more elegant approach as last step):

A = [1,5,7]
B = [6,6,2]

T = 2

S = 
5 5 1 
1 1 3
1 1 5

B = 
0 0 1
1 1 0
1 1 0

permutations of columns:

1- 
0 0 1
1 1 0
1 1 0

Trace = 1
other-diagonal = 3

Done -- > no need for more permutations because len(other-diagonal) == 3
A[1] =1 === B[3] =2
A[2] =5 === B[2] =6
A[3] =7 === B[1] =6

feel free to ask, or give any insights that you might have