Why is a combination of numpy functions faster than np.mean?

I am wondering what the fastest way to compute a mean in numpy is. I used the following code to experiment with it:

import time
import numpy as np

n = 10000
p = np.array([1] * 1000000)

t1 = time.time()
for x in range(n):
    np.divide(np.sum(p), p.size)
t2 = time.time()

print(t2-t1)

3.9222593307495117

t3 = time.time()
for x in range(n):
    np.mean(p)
t4 = time.time()

print(t4-t3)

5.271147012710571

I would have expected np.mean to be faster, or at least equally fast, but the combination of np.sum and np.divide is faster than np.mean. Why is that?

Solution

For integer input, by default, numpy.mean computes the sum in float64 dtype. This prevents overflow errors, but it requires a conversion for every element.

Your code with numpy.sum only converts once, after the sum has been computed.
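
The dtype difference can be checked directly. The following is a minimal sketch, not a benchmark. It assumes the same integer array p as in the question, and the variable names (total, total_f, manual) are only illustrative. Passing dtype=np.float64 to numpy.sum should roughly reproduce the float accumulation described above for numpy.mean.

```python
import numpy as np

p = np.array([1] * 1000000)  # integer array, as in the question

# For integer input, np.sum accumulates in an integer dtype
# (the exact width depends on platform and NumPy version).
total = np.sum(p)
print(total.dtype)

# np.mean returns a float64 result for integer input.
m = np.mean(p)
print(m.dtype)  # float64

# Asking np.sum for a float64 accumulator is roughly the work
# np.mean does before dividing by the element count.
total_f = np.sum(p, dtype=np.float64)
print(total_f.dtype)  # float64

# The approach from the question: integer sum first, then a single
# conversion to float when dividing.
manual = np.divide(total, p.size)
print(manual == m)  # True
```

Note the trade-off: the integer sum is faster here, but unlike the float64 accumulation it can overflow for large values.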