I have a list:
l = [1,2,3,4,5,6,7,8,9,10,11]
And I want to get the mean of each segment of 3 elements. If the length of the list isn't divisible by 3, the last value should just be the mean of whatever remains:
new = [2,5,8,10.5]
What is the best way to do this in terms of computation speed? The obvious for loop will be slow for big lists.
1. Using lists and numpy.mean:
import numpy as np
l = [1,2,3,4,5,6,7,8,9,10,11]
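# Split the list into chunks of 3; the last chunk may be shorter
chunks = [l[x:x+3] for x in range(0, len(l), 3)]
# Take the mean of each chunk
new = [np.mean(chunk) for chunk in chunks]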
print(new)
Output:
[2.0, 5.0, 8.0, 10.5]
Performance:
+-------------+----------------+
| List Length | Time Taken (s) |
+-------------+----------------+
| 1000        | 0.01           |
| 10000       | 0.05           |
| 100000      | 0.50           |
| 1000000     | 5.65           |
+-------------+----------------+
2. Using numpy arrays and numpy.mean:
import numpy as np
l = np.array([1,2,3,4,5,6,7,8,9,10,11])
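# Slice the array into chunks of 3; the last chunk may be shorter
chunks = [l[x:x+3] for x in range(0, len(l), 3)]
# Take the mean of each chunk
new = [np.mean(chunk) for chunk in chunks]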
print(new)
Output:
[2.0, 5.0, 8.0, 10.5]
Performance:
+-------------+----------------+
| List Length | Time Taken (s) |
+-------------+----------------+
| 1000        | 0.005          |
| 10000       | 0.03           |
| 100000      | 0.28           |
| 1000000     | 2.30           |
+-------------+----------------+
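How the timings in the tables were measured isn't shown here. If you want to compare the two methods on your own machine, one possible way is with timeit. This is only a sketch: the function names means_from_list and means_from_array are made up for illustration, and the numbers you get will depend on your hardware and NumPy version.

```python
import timeit
import numpy as np

def means_from_list(data, size=3):
    # Method 1: slice a plain Python list, then call np.mean on each chunk
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [np.mean(chunk) for chunk in chunks]

def means_from_array(arr, size=3):
    # Method 2: slice a NumPy array, then call np.mean on each chunk
    arr = np.asarray(arr)
    chunks = [arr[i:i + size] for i in range(0, len(arr), size)]
    return [np.mean(chunk) for chunk in chunks]

for n in (1000, 10000):
    data = list(range(n))
    arr = np.array(data)
    # Average over a few runs to smooth out noise
    t_list = timeit.timeit(lambda: means_from_list(data), number=3) / 3
    t_arr = timeit.timeit(lambda: means_from_array(arr), number=3) / 3
    print(f"n={n}: list {t_list:.4f}s, array {t_arr:.4f}s")
```

Increase the sizes in the loop (for example to 100000 and 1000000) to cover the same range as the tables.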
In these measurements the second method is roughly twice as fast as the first (2.30 s versus 5.65 s for 1,000,000 elements), so I strongly advise you to go with the second method.
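Both methods above still loop in Python and call np.mean once per chunk. If that is still too slow, a fully vectorized variant may be worth trying. The sketch below is not one of the two methods timed above and I have not benchmarked it here; chunk_means is a hypothetical helper name. It uses np.add.reduceat to sum each chunk in one call and then divides by the chunk sizes, so the shorter last chunk is handled correctly.

```python
import numpy as np

def chunk_means(values, size=3):
    # Hypothetical helper: mean of consecutive chunks of `size` elements
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        return np.empty(0)
    # Start index of every chunk: 0, size, 2*size, ...
    starts = np.arange(0, arr.size, size)
    # Sum of each chunk, computed in a single NumPy call
    sums = np.add.reduceat(arr, starts)
    # Number of elements in each chunk (the last one may be shorter)
    counts = np.diff(np.append(starts, arr.size))
    return sums / counts

l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
new = chunk_means(l)
print(new.tolist())  # [2.0, 5.0, 8.0, 10.5]
```

The result is a NumPy array rather than a list; call .tolist() if you need a plain list.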