Tags: python, numpy, pytorch

Generate array of non-negative integers that sum to k


My task is simple: I want to generate an (ideally numpy) array containing all combinations of m non-negative (>= 0) but bounded (<= e) integers that sum exactly to k. Note that k and m might be relatively large, so generating all combinations and filtering will not work.

I have implemented it in plain, recursive Python, but this small function takes most of my runtime and I need to replace it with something faster. I have tried to come up with numpy/pytorch code to generate this array, but I haven't managed to do it so far.

I currently use numpy and pytorch in my project, but I am open to other libraries as long as I can write Python code and end up with something I can convert to numpy arrays.

Here's some code:

import timeit


def get_summing_up_to(max_degree, sum, length, current=0):
    # Recursively enumerate all `length`-tuples of integers in [0, max_degree]
    # whose entries sum exactly to `sum`.
    assert sum >= 0
    assert length >= 1
    if length == 1:
        residual = sum - current
        if residual <= max_degree:
            return [(residual,)]
        else:
            return []
    max_element = min(max_degree, sum - current)
    return [
        (i,) + t
        for i in range(max_element + 1)
        for t in get_summing_up_to(
            max_degree, sum, length - 1,
            current=current + i
        )
    ]


if __name__ == '__main__':
    result = timeit.timeit('get_summing_up_to(60, 60, 6)', globals=globals(), number=1)
    print(f"Execution time: {result} for max_degree=60, sum=60, length=6")

    result = timeit.timeit('get_summing_up_to(30, 30, 8)', globals=globals(), number=1)
    print(f"Execution time: {result} for max_degree=30, sum=30, length=8")

Solution

  • One thing to notice is that your function generates redundant outputs, i.e. tuples that are permutations of one another: get_summing_up_to(30, 30, 8) would contain (30, 0, 0, 0, 0, 0, 0, 0), (0, 30, 0, 0, 0, 0, 0, 0), and so on.

    One way to make this more efficient is to generate only the unique integer combinations, excluding zeros, for lengths from 1 to max_length. We can also cache the sub-problem of generating partitions for added efficiency.

    from functools import lru_cache
    
    def get_summing_up_to_minimal(max_value, target_sum, max_length):
        # optional caching of sub-problems (consider bounding maxsize for very large inputs)
        @lru_cache(maxsize=None)
        def generate_partitions(remaining_sum, max_val, length):
            # Early pruning conditions
            if remaining_sum < 0:
                return []
            if length == 0:
                return [()] if remaining_sum == 0 else []
            
            # Minimum possible sum with given length (using all 1's)
            if remaining_sum < length:
                return []
            
            # Maximum possible sum with given length (using max_val)
            if remaining_sum > max_val * length:
                return []
                
            # Base case for length 1
            if length == 1:
                return [(remaining_sum,)] if remaining_sum <= max_val else []
    
            results = []
            # Optimize the start value
            start = min(max_val, remaining_sum)
            # Calculate minimum value needed to achieve remaining_sum with remaining length
            min_required = (remaining_sum - 1) // length + 1
            
            # Iterate only through viable values
            for i in range(start, min_required - 1, -1):
                # Early pruning: check if remaining values can sum to target
                remaining_length = length - 1
                remaining_target = remaining_sum - i
                
                # If maximum possible sum with remaining length is too small, break
                if i * remaining_length < remaining_target:
                    break
                    
                # If minimum possible sum with remaining length is too large, continue
                if remaining_target < remaining_length:
                    continue
                    
                sub_partitions = generate_partitions(
                    remaining_target,
                    min(i, max_val),
                    remaining_length
                )
                
                for sub_partition in sub_partitions:
                    results.append((i,) + sub_partition)
            
            return results
    
        all_partitions = []
        # Only try lengths that could possibly work
        min_length = (target_sum - 1) // max_value + 1
        max_possible_length = min(max_length, target_sum)
        
        for length in range(min_length, max_possible_length + 1):
            partitions = generate_partitions(target_sum, max_value, length)
            all_partitions.extend(partitions)
        
        return all_partitions
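
    As a quick sanity check, a small call returns only the zero-free, non-increasing partitions (the exact ordering may differ):

    out = get_summing_up_to_minimal(3, 4, 3)
    # out == [(3, 1), (2, 2), (2, 1, 1)]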
    

    If the full output with redundant results is required, we can generate them after the fact using the minimal set of outputs from get_summing_up_to_minimal:

    from itertools import permutations
    
    def expand_partitions(compact_partitions, max_length):
        result = []
        
        for partition in compact_partitions:
            # Calculate how many zeros we need to add
            zeros_needed = max_length - len(partition)
            if zeros_needed < 0:
                continue
                
            # Create the full partition with zeros
            full_partition = partition + (0,) * zeros_needed
            
            # Generate all unique permutations
            # Using a set to handle cases where partition contains duplicate numbers
            result.extend(set(permutations(full_partition)))
        
        return result
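
    As a small round-trip check (a sketch using the two functions above), expanding the minimal set for max_value=2, target_sum=3, max_length=2 recovers the same tuples as the original recursive function, up to ordering:

    minimal = get_summing_up_to_minimal(2, 3, 2)   # [(2, 1)]
    full = expand_partitions(minimal, 2)           # (2, 1) and (1, 2), in some order
    assert sorted(full) == sorted(get_summing_up_to(2, 3, 2))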
    

    Note that expanding the partitions accounts for the bulk of the compute time.

    I profiled the following:

    # run 1, your original code
    out = get_summing_up_to(30, 60, 6)
    
    # run 2, generating just minimal outputs
    out = get_summing_up_to_minimal(30, 60, 6)
    
    # run 3, generate minimal outputs and expand to full outputs
    out = get_summing_up_to_minimal(30, 60, 6)
    out = expand_partitions(out, 6)
    

    On my machine, your original code takes ~4.6 seconds. Generating the minimal outputs takes ~17.9 milliseconds. Generating the minimal outputs and expanding takes ~1.1 seconds.

    If your downstream use case doesn't require the redundant combinations, you can save a lot of time by generating just the minimal set. If you need to pack the outputs into a numpy array/torch tensor (which requires every row to have the same length), you can pad the minimal outputs to max_length with zeros without generating all the combinations, e.g.:

    import numpy as np

    out = get_summing_up_to_minimal(30, 60, 6)
    out_array = np.zeros((len(out), 6), dtype=int)  # one zero-padded row per partition

    for i, o in enumerate(out):
        out_array[i, :len(o)] = sorted(o)
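
    If you need a torch tensor rather than a numpy array, the padded array converts directly (assuming torch is available in your environment):

    import torch
    out_tensor = torch.from_numpy(out_array)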