Python built-in sum for nested list

When I finished leetcode 1313, I find a special usage of built-in sum function.

The Leetcode Problem 1313

We are given a list nums of integers representing a list compressed with run-length encoding.

Consider each adjacent pair of elements [a, b] = [nums[2*i], nums[2*i+1]] (with i >= 0). For each such pair, there are a elements with value b in the decompressed list.

Return the decompressed list.

Example 1:

Input: nums = [1,2,3,4]
Output: [2,4,4,4]
Explanation: The first pair [1,2] means we have freq = 1 and val = 2 so we generate the array [2].
The second pair [3,4] means we have freq = 3 and val = 4 so we generate [4,4,4].
At the end the concatenation [2] + [4,4,4,4] is [2,4,4,4].

There is a solution using `sum`

nums = [1,2,3,4]
g = ([b] * a for a, b in zip(nums[::2], nums[1::2]))
print(list(g))
g = ([b] * a for a, b in zip(nums[::2], nums[1::2]))
print(sum(g,[]))

Output:

[[2], [4, 4, 4]]
[2, 4, 4, 4]

My question

I can't tell why sum can deal with a nested list in this situation. Can any one tell me about it? Or some other functions behavior like this?

Here is the official guide for built-in sum.

Solution

Short-answer

The given code-fragment runs successive list concatenations.

How it works

Roughly the built-in sum() function works like this:

def sum(iterable, /, start=0):
    total = start
    for x in iterable:
        total = total + x
    return total

The + operator calls __add__ on the left-hand operand so that 3 + 4 runs as (3).__add__(4), an addition operation on integers. Likewise, [10, 20] + [30, 40, 50] runs as [10, 20].__add__([30, 40, 50]), a concatenation operation on lists.

Here's how it plays out in the given example:

>>> nums = [1,2,3,4]
>>> g = ([b] * a for a, b in zip(nums[::2], nums[1::2]))
>>> result = []
>>> x = next(g)
>>> result = result + x
>>> result
[2]
>>> x = next(g)
>>> result = result + x
>>> result
[2, 4, 4, 4]

Why it is not a good idea

Successive list concatenations make next list after each addition, so they run at O(n**2) speed, meaning that this a quadratic algorithm that runs excessively slow when given large inputs.

Better way

Instead of building new lists at each step, just extend the base list in-place. This runs at O(n) linear speed:

>>> nums = [1,2,3,4]
>>> g = ([b] * a for a, b in zip(nums[::2], nums[1::2]))
>>> result = []                 # new list
>>> for x in g:
        result.extend(x)        # extend in-place

>>> result
[2, 4, 4, 4]

Even better way

The itertools module provides a tool for chaining together iterators. This makes short-work of the problem:

>>> nums = [1,2,3,4]
>>> g = ([b] * a for a, b in zip(nums[::2], nums[1::2]))
>>> list(chain.from_iterable(g))
[2, 4, 4, 4]

This solution is short, fast and works well even with large inputs.