Tags: python, python-3.x, performance, null, coalesce

Is there a more efficient way of writing the coalesce function?


After looking through some code, the following function was found:

def coalesce(*args, null=None):
    return next((obj for obj in args if obj is not null and obj != null), null)

Is there a more efficient way to have this operation run or a more Pythonic way of thinking about the problem?

The first alternative tried was the following:

def coalesce(*args):
    return next(filter(None, args), None)

Here is the second alternative that was tried (it assumes that functools, itertools, and operator have been imported):

def coalesce(*args, null=None):
    return next(itertools.filterfalse(functools.partial(operator.eq, null), args), null)

This is a third alternative that came to mind:

def coalesce(*args):
    return next((obj for obj in args if obj is not None), None)

A fourth alternative was written in the hopes that code written in C would be faster:

def coalesce(*args):
    return next(itertools.filterfalse(functools.partial(operator.is_, None), args), None)

Using timeit, the timing results for the five functions above, in order, were:

  • 0.7040689999994356
  • 0.3396129999891855
  • 0.8870604000112507
  • 0.5313313000078779
  • 0.8086609000019962

This would seem to indicate that the second function, the one using filter(None, args), is preferable, but that does not answer the question of which is most Pythonic.
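One caveat when reading these timings: the filter(None, args) variant filters on truthiness rather than on None, so it is not a drop-in replacement for the other versions. A minimal sketch of the difference follows; the names coalesce_truthy and coalesce_not_none are labels introduced here for illustration and do not appear in the original code.

```python
def coalesce_truthy(*args):
    # filter(None, ...) keeps only truthy items, so 0, "" and [] are skipped too.
    return next(filter(None, args), None)


def coalesce_not_none(*args):
    # Only None is treated as missing; falsy values such as 0 are returned.
    return next((obj for obj in args if obj is not None), None)


print(coalesce_truthy(None, 0, 5))    # 5
print(coalesce_not_none(None, 0, 5))  # 0
```

If the arguments can legitimately be 0, an empty string, or an empty container, the two functions give different answers, so the speed comparison only matters when that difference is acceptable.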


Solution

  • Have you considered

    def coalesce4(*args):
        for x in args:
            if x is not None:
                return x
    

    which is significantly faster than the three functions shown in your question:

    In [2]: import tmp
    
    In [3]: %timeit tmp.coalesce1(None, None, 1)
    782 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
    
    In [4]: %timeit tmp.coalesce2(None, None, 1)
    413 ns ± 8.36 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
    
    In [5]: %timeit tmp.coalesce3(None, None, 1)
    678 ns ± 0.782 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
    
    In [6]: %timeit tmp.coalesce4(None, None, 1)
    280 ns ± 0.218 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
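The IPython measurements above can be approximated outside IPython with the standard timeit module. This is only a sketch: the names coalesce_gen, coalesce_loop, and best_time are hypothetical, the lambda wrapper adds a small constant overhead to every call, and absolute numbers will vary by machine and Python version.

```python
import timeit


def coalesce_gen(*args):
    # Generator-expression variant from the question (None-only check).
    return next((obj for obj in args if obj is not None), None)


def coalesce_loop(*args):
    # Explicit loop variant from this answer.
    for x in args:
        if x is not None:
            return x


def best_time(func, number=100_000, repeat=5):
    # Take the minimum over several repeats to reduce noise from other processes.
    timer = timeit.Timer(lambda: func(None, None, 1))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for func in (coalesce_gen, coalesce_loop):
        print(f"{func.__name__}: {best_time(func) * 1e9:.0f} ns per call")
```

Because both functions go through the same lambda wrapper, the relative difference between them is more meaningful than the absolute figures.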
    

    In Python 3.8, you'll have the option of

    def coalesce5(*args):
        # Compare against None inside any() so falsy values such as 0 are not skipped.
        if any((rv := x) is not None for x in args):
            return rv
    

    which is effectively the same as your third option, and the running time of a similar function shows it to be about the same speed (~680 ns).
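The explicit loop can also keep the null keyword from the original function in the question. The following is a minimal sketch under that assumption; coalesce_with_null is a hypothetical name, and the condition simply mirrors the original test (obj is not null and obj != null).

```python
def coalesce_with_null(*args, null=None):
    # Return the first argument that is neither identical nor equal to `null`.
    for obj in args:
        if obj is not null and obj != null:
            return obj
    # Nothing qualified: fall back to the sentinel, as the original function does.
    return null


print(coalesce_with_null(None, 0, 5))       # 0
print(coalesce_with_null("", "a", null=""))  # a
```

Returning null explicitly at the end matches the default passed to next() in the original, instead of relying on the implicit None that a bare loop would return.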