Search code examples
arraysalgorithmdata-structuresheapmultiset

Data structure to efficiently merge up to n elements of multiset


I am trying to solve the following problem. Let's define 2 multiset operations:

  1. |M,e| operation for an ordered multiset as getting lowest e elements of the multiset M. For example, |{1,1,2,3,3,4},4|={1,1,2,3}.
  2. Alternative sum set [E,F,G,e] as:
    • [E,F,G,e]=|E+F,e|, if sum all elements of multiset E is even,
    • [E,F,G,e]=|E+G,e|, if sum all elements of multiset E is odd

For example: [{0,2},{1,2},{0,3},3]=|{0,2}+{1,2},3|=|{0,2,1,2},3|=|{0,1,2,2},3|={0,1,2}

The problem statement is: give a family of n non-empty natural numbers multisets: X={x(0), x(1), …, x(n-1)} and number m and k, find the sum of multiset being the result of the following operation: […[[{},x(i_0),x(j_0),m],x(i_1),x(j_1),m]…,x(i_(k-1)),x(j_(k-1)),m], i.e. applying k times the alternative sum operation on the starting set ({}), for some give <i,j> pairs.

Right now I have solve this problem using an array representation of multisets and merging two multisets as efficient as I can merge two ordered arrays, cropping its size at the same time (in O(l) where l is min() of the sizes of both arrays being merged) and summing the result array at the same time.

But I think there might be a faster tree/heap-based solution that would allow me to merge the multisets faster and keep information about sum of the multiset in more flexible way.

What data structure would be the best choice in this case?


Solution

  • There certainly are heap-like data structures that can do the merge more quickly, for example the Fibonacci heap or pairing heap. However, any heap-like data structure will be slower at finding the m smallest entries; this will almost always be O(m log(n)). If the values for m are typically small, it could be worth trying, but if m is typically not too much smaller than n, I would guess your current solution is probably faster overall.