I'm trying to find an algorithm to find K
disjoint, contiguous subsets of size L
of an array x
of real numbers that maximize the sum of the elements.
Spelling out the details, X is a set of N positive real numbers:
X={x[1],x[2],...x[N]} where x[j]>=0 for all j=1,...,N.
A contiguous subset of length L called S[i]
is defined as L consecutive members of X starting at position n[i]
and ending at position n[i]+L-1
:
S[i] = {x[j] | j=n[i],n[i]+1,...,n[i]+L-1} = {x[n[i]],x[n[i]+1],...,x[n[i]+L-1]}.
Two of such subsets S[i]
and S[j]
are called pairwise disjoint (non-overlapping) if |n[i]-n[j]|>=L
. In other words, they don't contain any identical members of X.
Define the summation of the members of each subset:
SUM[i] = x[n[i]]+x[n[i]+1]+...+x[n[i]+L-1];
The goal is find K contiguous and disjoint(non-overlapping) subsets S[1],S[2],...,S[K]
of length L such that SUM[1]+SUM[2]+...+SUM[K]
is maximized.
This is solved by dynamic programming. Let M[i]
be the best solution only for the first i
elements of x
. Then:
M[i] = 0 for i < L
M[i] = max(M[i-1], M[i-L] + sum(x[i-L+1] + x[i-L+2] + ... + x[i]))
The solution to your problem is M[N]
.
When you code it, you can incrementally compute the sum (or simply pre-compute all the sums) leading to an O(N
) solution in both space and time.
If you have to find exactly K
subsets, you can extend this, by defining M[i, k]
to be the optimal solution with k
subsets on the first i
elements. Then:
M[i, k] = 0 for i < k * L or k = 0.
M[i, k] = max(M[i-1, k], M[i-L, k-1] + sum(x[i-L+1] + ... + x[i])
The solution to your problem is M[N, K]
.
This is a 2d dynamic programming solution, and has time and space complexity of O(NK
) (assuming you use the same trick as above for avoiding re-computing the sum).