Given an array [x1, x2, x3, ..., xk ] where xi is the number of items in box i, how can I redistribute the items so that no box contains more than N items. N is close to sum(xi)/k -- That is, N is close to every box having the same number of items. Intermediate boxes shouldn't be used to carry items -- if x1 has a surplus and x2 and x3 have deficits, x1 should send some items to x2 and to x3, but not send all its items to x2 and then let x2 resolve the surplus.
The actual problem is: each computing node has a set of samples, and after a resampling step some computer nodes might have a surplus while others have a deficit, so I'd like to re-distribute the samples while minimizing communication.
I imagine this sort of problem has a name.
This problem could be modeled as an instance of minimum-cost flow.
Let G be a directed graph with vertices s, t, a1, …, ak, b1, …, bk and arcs s→ai of capacity xi and cost 0, arcs ai→bj of infinite capacity and cost 0 if i = j and cost 1 if i ≠ j, and arcs bj→t of capacity N and cost 0. We want the minimum-cost flow that moves ∑i xi units from s to t. The idea is that if y units flow ai→bj, then y items are moved from box i to box j (if i ≠ j; if i = j, then no move occurs).
Using minimum-cost flow to solve this simple problem is of course using a sledgehammer to crack a nut, but several variants can be modeled as network flow problems.
If a pair of nodes i, j is not directly connected, remove the ai→bj arc.
We can start up and shut down nodes by giving them vertices on only the "a" side or the "b" side.
We can model differences in communication speeds by adjusting the costs from uniform.
We can limit the number of items two nodes exchange by limiting the capacity of their arc.
We can introduce more interior vertices and change the complete bipartite topology of the connections to model the effects of network contention.