
Plotting Medians of Bins of Data


I have binned my x values and computed the median of the y values that fall into each bin. My data look like this:

  • x values range from 0 to 1
  • y values can take any value, but each y value has an x value associated with it

My code:

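import numpy as np
import pandas as pd
hist1, bins1 = np.histogram(x)  # counts per bin and the bin edges
medians_1 = pd.Series(y).groupby(pd.cut(x, bins1)).median()  # median of y in each x bin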

hist = [129, 126, 94, 133, 179, 206, 142, 147, 90, 185]
bins = [0., 0.09999926, 0.19999853, 0.29999779, 0.39999706,
        0.49999632, 0.59999559, 0.69999485, 0.79999412, 0.8999933,
        0.99999265]
medians_1 = [14.42145, 14.428275, 14.427865, 14.42535, 14.42613,
             14.430235, 14.441055, 14.43472, 14.424155, 14.4187]

How can I plot the median value of each bin?

I tried to make a scatter plot, but I only have the median values and no matching x-axis values. I also cannot plot the medians against the original x values, because the two arrays have different lengths.


Solution

  • You could plot the medians against the bin centers. The center of each bin is the midpoint of its two edges and can be calculated like this:

    import numpy as np
    bin_center = (np.asarray(bins[1:])+np.asarray(bins[:-1]))/2
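Putting the binning, the medians, and the bin centers together could look like the sketch below. The function names `binned_medians` and `plot_binned_medians` are hypothetical helpers, not part of NumPy, pandas, or Matplotlib. The sketch also passes `include_lowest=True` to `pd.cut`, which is an assumption that differs from the code in the question: without it, the smallest x value sits exactly on the first bin edge and is dropped from the first bin.

```python
import numpy as np
import pandas as pd


def binned_medians(x, y, bins=10):
    """Return (bin_centers, medians) of y grouped by histogram bins of x.

    Hypothetical helper: bin edges come from np.histogram, as in the question.
    """
    x = np.asarray(x, dtype=float)
    _, edges = np.histogram(x, bins=bins)
    # include_lowest=True (an assumption, not in the question's code) keeps
    # the minimum x value inside the first bin instead of dropping it.
    groups = pd.cut(x, edges, include_lowest=True)
    # observed=False keeps empty bins, so medians has one entry per bin
    # (NaN where a bin contains no points).
    medians = pd.Series(y).groupby(groups, observed=False).median()
    centers = (edges[1:] + edges[:-1]) / 2
    return centers, medians.to_numpy()


def plot_binned_medians(ax, centers, medians):
    """Draw the medians against the bin centers on an existing Matplotlib Axes."""
    ax.plot(centers, medians, marker="o")
    ax.set_xlabel("x (bin center)")
    ax.set_ylabel("median of y in bin")
    return ax
```

To use it, create an Axes with `fig, ax = plt.subplots()` (after `import matplotlib.pyplot as plt`), call `plot_binned_medians(ax, *binned_medians(x, y))`, and finish with `plt.show()`. Because the centers and medians have the same length (one value per bin), they can be plotted directly against each other.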