Binning data and calculating MAE for each bin in Python

I have two arrays:

Obs=([])
abs_error=([])

I want to use Obs to define the bins. For example, where Obs is between 1 and 2, the corresponding abs_error values go into bin #1; where Obs is between 2 and 3, they go into bin #2; and so on.

Once I have abs_error binned by Obs, I want to calculate the mean of each bin and then plot the bin means on the y-axis against the bins on the x-axis.

How do I go about binning the abs_error by bins defined by the Obs? And how do I calculate the mean of each bin once this is done?

Right now I have:

abs_error=np.array([2.214033842086792 2.65031099319458 2.021354913711548 ... 2.831442356109619 1.9227538108825684 0.19358205795288086])
obs=np.array([3.3399999141693115 1.440000057220459 1.2799999713897705 ... 5.78000020980835 6.050000190734863 7.75])
bin_boundaries=np.array([0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0,17.0,18.0,19.0,20.0])

idx = np.digitize(obs, bin_boundaries)
mn = np.bincount(idx, abs_error) / np.bincount(idx)
print(mn)

[83.09254473  3.18577858  2.82887524  2.78532805  2.43264693  1.96835116 1.77645996  1.66138196  1.5972414   1.57512014  1.53094066  1.7965252 1.98050336  2.29916244  3.06640482  4.66769505  3.16787195]

I can't print the whole arrays because they are very big.

Solution

If your bins all have the same width, you can get the bin indices from Obs with floor division. In your example the width is 1:

idx = (Obs // 1).astype(int)

If the bins are not all the same width, use np.digitize instead:

idx = np.digitize(Obs, bin_boundaries)
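Note that the two approaches number the bins differently, which matters when you label the results. Here is a small sketch with made-up sample values (not the data from the question), assuming the unit-width edges 0, 1, ..., 6:

```python
import numpy as np

# Hypothetical sample values, not the asker's data.
obs = np.array([0.5, 1.44, 1.28, 3.34, 5.78])
bin_boundaries = np.arange(0.0, 7.0)  # edges 0, 1, ..., 6

# Floor division is 0-based: values in [0, 1) get index 0.
idx_floor = (obs // 1).astype(int)

# np.digitize is 1-based here: values in [0, 1) get index 1,
# and index 0 is reserved for values below the first edge.
idx_digitize = np.digitize(obs, bin_boundaries)

print(idx_floor)
print(idx_digitize)
```

With edges starting at 0 and width 1, the np.digitize index is the floor-division index plus one.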

Once you have the indices, pass them to np.bincount, using abs_error as the weights, to obtain the mean of each bin:

mn = np.bincount(idx, abs_error) / np.bincount(idx)
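One thing to watch for: a bin with no observations gives a 0/0 division, and values outside the bin edges end up in extra bins at either end. A minimal sketch of one way to guard against both, again with made-up sample values; the names n_bins, inside, mae and centers are my own and not from the question:

```python
import numpy as np

# Hypothetical sample values, not the asker's data.
obs = np.array([0.5, 0.7, 1.44, 1.28, 3.34, 3.9])
abs_error = np.array([2.0, 4.0, 1.0, 3.0, 0.5, 1.5])
bin_boundaries = np.arange(0.0, 5.0)  # edges 0, 1, 2, 3, 4 -> 4 bins

n_bins = len(bin_boundaries) - 1
idx = np.digitize(obs, bin_boundaries) - 1  # shift to 0-based bin indices
inside = (idx >= 0) & (idx < n_bins)        # keep only values within the edges

# Sum of errors and number of samples per bin; minlength keeps empty
# trailing bins in the output.
sums = np.bincount(idx[inside], weights=abs_error[inside], minlength=n_bins)
counts = np.bincount(idx[inside], minlength=n_bins)

# Mean per bin, leaving NaN where a bin has no samples.
mae = np.full(n_bins, np.nan)
np.divide(sums, counts, out=mae, where=counts > 0)

# Bin midpoints, usable as x-values when plotting.
centers = 0.5 * (bin_boundaries[:-1] + bin_boundaries[1:])

print(centers)
print(mae)
```

The centers and mae arrays have the same length, so they can be passed directly to a plotting call such as matplotlib's plt.plot(centers, mae) to get the bin means against the bins.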