Search code examples
algorithmpowershelllogic

Splitting an array of objects into multiple "buckets."


I assume this is like CompSci 101, but I've never taken a formal coding class. I'm just an admin self taught through experience by necessity. I prefer to figure things out on my own from as close to "scratch" as is reasonably possible, but this one is proving to be a bit confounding at my mediocre IQ and I would appreciate some help. I've kind of got a... a pretzel in my head.

I've been thinking recently about how I could efficiently arrange an array of objects, each with a numerical value of varying size, into the minimum possible quantity of fixed size "buckets." Like if I had 1,000 instances of values from 1-100 and I wanted to figure out how to arrange them into the least number of buckets assuming the sum of the values in each bucket (not the quantity of objects) cannot exceed 100, or 200, or whatever size. Not just how many buckets do I need. That's like 1st or 2nd grade math.

If I'm not explaining that well enough, think of a problem like this. UPS has 100,000 packages. Each weighs anywhere from 1-100 pounds, and you know how much each weighs. Each UPS truck can carry 10,000 pounds. Which packages should go onto which trucks to minimize the number of trucks needed? No, I don't work for UPS. That's just the first analogy which came to mind. Yes, I know I'm discounting volume in this analogy. This isn't specifically for a job, a class, or a test. It's just been bugging me lately and making me feel stupid.

I've done a few things in the past. They all feel rather "ghetto" and imprecise, but they're easy for me to comprehend, quick to code, and quicker to execute when I want to batch objects and distribute their processing rather than having a simple serial loop on a single array of objects executing in one place.

I've sorted the objects by descending value, decided on my bucket size and quantity by whatever criteria appropriate for the task, then iterated through the array sequentially assigning each object to the next bucket, wrapping back around to the first bucket as I run out of buckets. If I were sorting the exact values of 1-100 into 3 buckets, 100 goes into bucket 1, 99 into bucket 2, 98 into bucket 3, 97 into bucket 1, and so on.

I've also picked a bucket size and iterated through the array assigning to the current bucket until it runs out of capacity, then created a new bucket for the next object, while also checking each previous bucket on each current object to see if one of them has sufficient remaining capacity to accommodate it before creating a new bucket for it.

And some variations and combinations of those.

I'm not really looking for a fully coded language specific solution (I think I'm reasonably competent with PowerShell, and JavaScript to a lesser extent) or to be spoon fed "the answer." I'm looking for a high level overview or a bit of direction on how problems like this are addressed to get my light bulb to turn on enough to finish on my own. This seems like a common enough problem, but the verbiage I've used when searching doesn't turn up anything close to what I'm trying to accomplish.


Solution

  • The family of problems you've described in abstract is more commonly known as "bin packing problems", with your specific constraints describing what we might call a "minimum-bin instance packing problem" - having to pack a finite set of items with differing sizes into a smallest-possible set of bins, each of a given maximum capacity.

    The attempts you've described all sound like so-called first-fit algorithms - as you've probably found they can be implemented relatively efficiently (in terms of both computational and source code complexity), but sometimes end up producing sub-optimal packing distributions, meaning you'll end up using more bins/buckets than you could have.

    Bin packing is closely related to other packing problems, such as the 0-1 variant of the Knapsack Problem - technically a bin packing problem with a maximum bin count of 1.