First, I'd like to say I'm not looking for code, I'm looking for an algorithm.
I'm writing top-level testing of a complex real-time software system. It runs all software components (~20 processes, ~100 threads), sets up fake data sources (rtsp video sources) and feeds prepared data (video files) to the system, records system responses (events) and then stops the system after all prepared test data was sent.
As the test data is always the same, I expect tested system to provide correct responses (events) at correct times (from the start of the test).
I then compare generated responses (events) to expected events (manually prepared), which I expect to be all there, possibly with some small time variances which I would limit with some given time-tolerance
, let's say 5 seconds.
Let's say the tested system is supposed to detect animals in 1500 seconds long video and I watched it and written down 5 animals and time when they appeared in video:
at 10s - a sparrow
at 20s - a cat
at 50s - a rabbit
at 100s - an owl
at 1000s - a bear
Based on that, I would then write expected_events
set:
expected_events = [
Event(10, 'sparrow'),
Event(20, 'cat'),
Event(50, 'rabbit'),
Event(100, 'owl')
Event(1000, 'bear')
]
And I want to be able to tell how well the real detected events (which will be affected by processor load, disk usage, network usage atd, as this is multiprocess system on real computer) match these expected_eevents
.
Let's say the tested system returned:
detected_events = [
Event(10.1, 'sparrow'),
Event(19.5, 'cat'),
Event(50.2, 'rabbit'),
Event(99.3, 'owl')
Event(1000.2, 'bear')
]
Which I would evaluate as correct, 100% match with expected events, all events are present and time differences are below time-tolerance
:
matches = [
{'name': 'sparrow', 'detected': 10.1, 'expected': 10, 'time-diff': 0.1},
{'name': 'cat', 'detected': 19.5, 'expected': 20, 'time-diff': 0.5},
{'name': 'rabbit', 'detected': 50.2, 'expected': 50, 'time-diff': 0.2},
{'name': 'owl', 'detected': 99.3, 'expected': 100, 'time-diff': 0.7},
{'name': 'bear', 'detected': 1000.2, 'expected': 1000, 'time-diff': 0.2},
]
If the tested system returned:
detected_events = [
Event(10.1, 'sparrow'),
Event(50.2, 'rabbit'),
Event(99.3, 'owl')
Event(1010.5, 'bear')
]
which I would expect could result in matches like this:
raw_matches = [
{'name': 'sparrow', 'detected': 10.1, 'expected': 10, 'time-diff': 0.1},
{'name': 'cat', 'detected': None, 'expected': 20, 'time-diff': None},
{'name': 'rabbit', 'detected': 50.2, 'expected': 50, 'time-diff': 0.2},
{'name': 'owl', 'detected': 99.3, 'expected': 100, 'time-diff': 0.7},
{'name': 'bear', 'detected': 1010.5, 'expected': 1000, 'time-diff': 10.52},
]
pruned_matches = [
{'name': 'sparrow', 'detected': 10.1, 'expected': 10, 'time-diff': 0.1},
{'name': 'rabbit', 'detected': 50.2, 'expected': 50, 'time-diff': 0.2},
{'name': 'owl', 'detected': 99.3, 'expected': 100, 'time-diff': 0.7},
]
I would see this as fail, because:
So I what I need is a method for evaluating detected_events
against expected_events
to be able to evaluate how good the tested system works.
Because matching event types is essential to the problem and can be done as matching for every event type separately, I'll do following simplyfications:
int
to make it easier to readAs many of you pointed out in comments, I actually don't have metric for evaluating final matching beyond dismissing matches with time difference > time-tolerance
.
This makes it a bit harder, but I think it's intuitive - I know what should happen at which time, and I compare that to actual events and I'll try to match them as good as is possible to make sure:
detected_event
matching expected_event
must happen at the same time with given time tolerance.So this I would consider "correct" matches (with 5 second time tolerance):
matches = [
expected_events = [10, 20] => {'expected': 10, 'detected': 10},
detected_events = [10, 20] {'expected': 20, 'detected': 20},
]
matches = [
expected_events = [10, 20] => {'expected': 10, 'detected': 15},
detected_events = [15, 25] {'expected': 20, 'detected': 25},
]
matches = [
expected_events = [10, 20] => {'expected': 10, 'detected': 5},
detected_events = [ 5, 15] {'expected': 20, 'detected': 15},
]
matches = [
expected_events = [10, 11] => {'expected': 10, 'detected': 11},
detected_events = [11, 12] {'expected': 11, 'detected': 12},
]
matches = [
expected_events = [10, 20] => {'expected': 10, 'detected': 10},
detected_events = [10, 26] ]
expected_events = [10, 20] => matches = []
detected_events = [ 4, 26]
matches = [
expected_events = [10, 20, 30] => {'expected': 20, 'detected': 17},
detected_events = [17, 24] ]
This I would consider "bad" matches (i.e. this is not how I want it to work):
matches = [
expected_events = [10, 20] => {'expected': 20, 'detected': 15},
detected_events = [15, 25] ]
# matched only 1 events even though it's possible to match 2
matches = [
expected_events = [10, 11] => {'expected': 11, 'detected': 11},
detected_events = [11, 12] ]
# matched only 1 events even though it's possible to match 2
matches = [
expected_events = [10, 20] => {'expected': 10, 'detected': 4},
detected_events = [ 4, 26] {'expected': 20, 'detected': 26},
]
# should not match anything, time differences > 5sec
Code should look something like this:
expected_events = [10, 20, 50, 100, 1000] # times in second
detected_events = [11, 18, 51, 1001]
def metric(ev1, ev2):
return abs(ev1 - ev2)
matches = match_events(expected_events, detected_events, metric, tolerance=5)
Simple, naive approach - start from best match
I tried extremely simple approach:
This works for simple cases, but I run into a problem when events are "shifted":
expected_events = [10, 11]
detected_events = [11, 12]
Now I get:
[
{'expected': 11, 'detected': 11},
]
While I want to have:
[
{'expected': 10, 'detected': 11},
{'expected': 11, 'detected': 12},
]
Brute force - permutations
Then I tried brute force:
This works, but as you may have expected, it's too slow for anything longer than few elements. (I hoped to make it work using lazy evaluation, but couldn't make it work).
What would be a correct algorithm for such matching? It seems to me this may be already solved problem - but I don't know what to search for.
It is a solved problem - assignment problem - https://en.wikipedia.org/wiki/Assignment_problem - thanks to Yossi Levi!
I will follow your asking, and provide a non-programming approach, but only logical (semi-mathematical) approach as I see it.
Algorithm steps:
T
clarification: S[i,j] refer to the difference between i'th element from list A to j'th element in list B
for example:
0 0 0
B = 0 1 1
1 0 0
saying X dimension represent list A and Y represent list B then:
B[2] === A[2]
B[2] === A[3]
B[3] === A[1] (=== means that two elements satisfy the criteria of similarity)
Now - the question becomes more mathematical, in order to find the best match, you can use one of the following suggestions:
brute force (less elegant approach in my opinion):
score
score
more elegant approach:
Best approach:
While searching over the net on how to find the maximum Trace of matrix when only columns swapping allowed I ran into this (pay attention please to Example and Matrix interpretation sections). moreover, I found implementation of it in PyPI package.
As the documentation says - the algorithm is trying to find the minimum Trace over all possible permutations of matrix - so just invole with -B
as argument instead.
overall example (with more elegant approach as last step):
A = [1,5,7]
B = [6,6,2]
T = 2
S =
5 5 1
1 1 3
1 1 5
B =
0 0 1
1 1 0
1 1 0
permutations of columns:
1-
0 0 1
1 1 0
1 1 0
Trace = 1
other-diagonal = 3
Done -- > no need for more permutations because len(other-diagonal) == 3
A[1] =1 === B[3] =2
A[2] =5 === B[2] =6
A[3] =7 === B[1] =6
feel free to ask, or give any insights that you might have