python, performance, numpy, logarithm

Faster estimation of logarithm operation


I have a fairly simple function involving a base-10 logarithm (f1 shown below). I need it to run as fast as possible, since it is called millions of times as part of a larger program.

I tried a Taylor approximation (f2 below), but even with a large expansion the accuracy is very poor and, even worse, it ends up taking a lot more time.

Have I reached the limit of performance attainable with numpy?

import time
import numpy as np

def f1(m1, m2):
    return m1 - 2.5 * np.log10(1. + 10 ** (-.4 * (m2 - m1)))


def f2(m1, m2):
    """
    Taylor expansion of 'f1'.
    """
    x = -.4 * (m2 - m1)
    return m1 - 2.5 * (
        0.30102999 + .5 * x + 0.2878231366 * x ** 2 -
        0.0635837 * x ** 4 + 0.0224742887 * x ** 6 -
        0.00904311879 * x ** 8 + 0.00388579 * x ** 10)


# The data I actually use has more or less this range.
N = 1000
m1 = np.random.uniform(5., 30., N)
m2 = np.random.uniform(.7 * m1, m1)

# Test both functions (time.clock() was removed in Python 3.8; use perf_counter).
M = 5000
s = time.perf_counter()
for _ in range(M):
    mc1 = f1(m1, m2)
t1 = time.perf_counter() - s
s = time.perf_counter()
for _ in range(M):
    mc2 = f2(m1, m2)
t2 = time.perf_counter() - s

print(t1, t2, np.allclose(mc1, mc2, rtol=0.01))

Solution

  • With this code snippet, I'm not sure the log itself is what you should optimize; it's more the whole vector expression.

    You can try numexpr (a fast numerical array expression evaluator for Python and NumPy), which might do a lot for you.

    The idea to try this came from Ignacio's comment, which made me wonder where his speedup was coming from (I'm sure it's not coming from the log calculation itself).

    In my simple modification of your code:

    import numexpr as ne
    def f1(m1, m2):
        return ne.evaluate("m1 - 2.5 * log10(1.0 + 10 ** (-0.4 * (m2 - m1)))")
    

    it seems the above is 5-6x as fast as the (unoptimized) f2 approximation, while still giving the original accuracy.

    It's also nearly twice as fast as the original NumPy-based f1.
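
    If you want to double-check those claims on your own machine, a minimal sketch like the one below re-creates the benchmark data, times both versions with timeit, and verifies that the results agree. It assumes numexpr is installed; the f1_np/f1_ne names are mine, just to keep both variants side by side.

    import timeit
    import numpy as np
    import numexpr as ne

    def f1_np(m1, m2):
        # original NumPy version
        return m1 - 2.5 * np.log10(1. + 10 ** (-.4 * (m2 - m1)))

    def f1_ne(m1, m2):
        # numexpr version; evaluate() picks up m1/m2 from the local scope
        return ne.evaluate("m1 - 2.5 * log10(1.0 + 10 ** (-0.4 * (m2 - m1)))")

    N, M = 1000, 5000
    m1 = np.random.uniform(5., 30., N)
    m2 = np.random.uniform(.7 * m1, m1)

    t_np = timeit.timeit(lambda: f1_np(m1, m2), number=M)
    t_ne = timeit.timeit(lambda: f1_ne(m1, m2), number=M)
    print(t_np, t_ne, np.allclose(f1_np(m1, m2), f1_ne(m1, m2)))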

    These numbers might change depending on numexpr's setup, since Intel's MKL, for example, could be used as well. As I haven't checked my Anaconda-based setup in detail, I offer this just as a tech demo that everyone can try out themselves.
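
    Since the MKL/threading effect is hard to state in general, a small snippet like the following can be used to see how the local numexpr build is configured; treat it as a sketch, as the exact output depends on the installed version.

    import numexpr as ne

    print(ne.__version__)               # installed numexpr version
    print(ne.detect_number_of_cores())  # cores numexpr detects on this machine
    print(ne.use_vml)                   # True if built against Intel's VML/MKL

    # The worker-thread pool size can be tuned; the previous value is returned.
    previous = ne.set_num_threads(4)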

    While I've only used numexpr a few times in the past for simple stuff, I might add that it's also used within pandas, just to mention a real-world project that depends on its working correctly.

    Disclaimer: I used your benchmark as a template (and hope caching and the like does not play a role).