I have a matcher like so:
type matcher struct {
all []int
none []int
any [][]int
}
For an array of integers to match, it must contain all of the integers in all
, none of the integers in none
, and must contain at least one integer from each array of integers in any
. I have a factory to make the construction of such a matcher easy, but once I have the factory fields I need to "flatten" the structure to make sure none of the terms are contradictory. Right now, what I have is (pseudocode):
all = [0, 1, 2, 3] // means all of these numbers must be present for the matcher to match
any = [[4, 5], [6, 7, 8]] // means one number from each of these groups must be present
none = [9,10] // means none of these may be present
sort(all)
sort(none)
if len(intersections(all, none)) != 0 {
panic("All and none are contradictory!")
}
for gIndex, group in any {
sort(group)
for index, integer in group {
// If all contains the integer...
if binarySearch(all, integer) {
swap(group[len(group)-1], group[index]) // swap the last item with the integer...
group = group[:len(all)-1] // and truncate the group by one; most efficient way to delete an item.
}
// If none contains the integer...
if binarySearch(none, integer) {
panic("item in any is also found in none; possible contradiction")
}
}
if len(group) < 1 {
// delete group from any
} else if len(group) == 1 {
// put group[0] (only integer in it) into all, since it must be met
} else {
sort(group) // Re-sort the group
any[gIndex] = group // Update the group
}
}
removeDuplicates(all)
removeDuplicates(none)
sort2D(any) // Sort by their greatest member
removeDuplicates2D(any) // Go through and remove any duplicate arrays.
all, any, and none should now be reduced to their simplest form, so that checking if an integer array matches should be more efficient and without contradictions.
Is there a better way to do this, and am I missing anything here?
There are at least two flaws in your solution.
You are removing integers in all
from the any
groups. In fact you have to remove the whole group instead of removing an item if there is an intersection with all
. any
groups containing items from all
are just redundant.
Having overlaps with the none
group is not necessarily a problem. These items could in fact just be removed from the any
groups. But then you might want to throw an error when this leads to an empty any
group. A matcher with an empty any
group won't match anything. Anyway, don't just quietly remove empty any
groups!
Something similar applies for any
groups subsuming other any
groups. Consider this choice of any:
any = [[4, 5], [4, 5, 6]]
Let's assume that all
and none
are empty. Then this would remain unchanged by your algorithm. However, the second group is clearly redundant. Any array containing at least one element of the first group will also match the second.
This means that at least you have to remove all any
arrays containing a superset of others.
I'm confident that this leads to a canonical form that you can use to compare matchers. However, I may be wrong.