python, numpy, standard-deviation

How to flatten a numpy array based on frequency to get the correct standard deviation?


I can easily get the standard deviation of some numbers in a 1D array in numpy, like below:

import numpy as np
arr1 = np.array([100, 100, 100, 200, 200, 500])
sd = np.std(arr1)
print(sd)

But my data is in the form of a 2D array, in which the second value of each row is the frequency of the first:

arr2 = np.array([[100, 3], [200, 2], [500, 1]])

How can I flatten it based on frequency (turn arr2 into arr1) so that I get the correct standard deviation?


Solution

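  • Use arr2[:, 0].repeat(arr2[:, 1]). This repeats each value in the first column as many times as the frequency in the second column, which reproduces arr1, so calling np.std on the result gives the standard deviation you are after.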
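
As a minimal sketch of the above, using the arr2 from the question: the names values, counts, flat, mean_w and sd_weighted are only illustrative. The second half is an alternative that is not part of the original answer. It computes the same population standard deviation directly from the frequencies with np.average, which may be preferable when the counts are large enough that the flattened array would be big. Both approaches assume the frequencies are non-negative integers.

```python
import numpy as np

arr2 = np.array([[100, 3], [200, 2], [500, 1]])
values, counts = arr2[:, 0], arr2[:, 1]

# Approach from the answer: expand each value by its frequency, then use np.std.
flat = np.repeat(values, counts)   # same as arr2[:, 0].repeat(arr2[:, 1])
sd_flat = np.std(flat)

# Alternative (an assumption, not from the answer): weighted mean and variance,
# with no intermediate flattened array.
mean_w = np.average(values, weights=counts)
sd_weighted = np.sqrt(np.average((values - mean_w) ** 2, weights=counts))

print(flat)         # [100 100 100 200 200 500]
print(sd_flat)      # about 141.42
print(sd_weighted)  # same value as sd_flat
```

Note that np.std defaults to the population standard deviation (ddof=0); the weighted version matches that default, not the sample standard deviation.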