Tags: python, python-2.7, matplotlib, histogram, cdf

Python Matplotlib histogram bin shift


I have created a cumulative (CDF) histogram from a list, which is what I wanted. Then I subtracted a fixed value (using x = [fixed_value - i for i in myArray]) from each element in the list to essentially just shift the bins over by a fixed amount. This, however, makes my CDF histogram inverted in the y-axis. I thought it should look identical to the original except that the x-axis (bins) would be shifted by a fixed amount.

So can someone explain what I am doing wrong, or suggest a way to just shift the bins over instead of recreating another histogram with a new array?

EDIT:

Sometimes I see this error:

>>> plt.hist(l,bins, normed = 1, cumulative = True)
C:\Python27\lib\site-packages\matplotlib\axes.py:8332: RuntimeWarning: invalid value encountered in true_divide
  m = (m.astype(float) / db) / m.sum()

But it is not exclusive to the second (subtracting) case, and plt.hist returns a NaN array. Not sure if this helps, but I think I am getting closer to figuring it out.
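
One plausible cause of this warning, going by the line quoted in the traceback (m / db / m.sum()), is that no data falls inside the bin range, so the total count is zero and the normalisation becomes 0/0. The sketch below reproduces that arithmetic with NumPy alone; the values and bin edges are hypothetical and chosen only so that nothing lands in any bin.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])   # hypothetical data, all positive
bins = np.linspace(-5, 0, 6)         # hypothetical bin edges, all <= 0

# Every value lies outside the bin range, so each bin count is zero.
counts, _ = np.histogram(values, bins=bins)

# Same normalisation as in the traceback: counts / bin width / total count.
# The total is zero here, so every entry becomes 0/0 = NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    normalised = counts.astype(float) / np.diff(bins) / counts.sum()

print(counts.sum(), normalised)
```

If this is what is happening, checking that min(data) and max(data) overlap the bin edges before calling plt.hist should make the warning go away.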

EDIT: Here are my two graphs. The first is the "good" one. The second is the shifted "bad" one:

All I want to do is shift the first one's bins over by a fixed amount. However, when I subtract the same value from each element of the list that goes into the histogram, it seems to alter the histogram in both the y-direction and the x-direction. Also, note that the values in the first histogram are all negative, while those in the second are positive. I seem to have fixed it by keeping the values negative (I use original_array[i] - fixed_value < 0 instead of fixed_value - original_array[i] > 0).


Solution

  • I think that the problem might be in how you calculate the shifted values. This example works fine for me:

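    import numpy as np
    import matplotlib.pyplot as pl
    
    # Sample data and fixed bin edges shared by both panels
    original_array = np.random.normal(size=100)
    bins = np.linspace(-5, 5, 11)
    
    pl.figure()
    pl.subplot(121)
    # density=True replaces normed=1, which newer Matplotlib releases no longer accept
    pl.hist(original_array, bins, density=True, cumulative=True, histtype='step')
    
    # Shift every value by a constant: value + offset, not offset - value
    offset = -2
    modified_array = [original_value + offset for original_value in original_array]
    
    pl.subplot(122)
    pl.hist(modified_array, bins, density=True, cumulative=True, histtype='step')
    pl.show()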
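
The difference from the expression in the question is the order of the subtraction: fixed_value - x negates each value before shifting it, so the distribution is mirrored rather than just moved, while x - fixed_value (or x + offset) is a pure shift. The following is a minimal sketch of that comparison using NumPy only; fixed_value, the bin edges, and the helper cumulative_fraction are illustrative assumptions, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded so the sketch is repeatable
data = rng.normal(size=1000)
fixed_value = 2.0                     # hypothetical constant
bins = np.linspace(-8, 8, 33)         # bin width 0.5, wide enough for all cases


def cumulative_fraction(values, edges):
    """Fraction of samples that fall at or below each right-hand bin edge."""
    counts, _ = np.histogram(values, bins=edges)
    return np.cumsum(counts) / counts.sum()


cdf_original = cumulative_fraction(data, bins)
cdf_shifted = cumulative_fraction(data - fixed_value, bins)    # pure shift
cdf_mirrored = cumulative_fraction(fixed_value - data, bins)   # negate, then shift

# A shift of 2.0 is exactly four bins of width 0.5, so the shifted curve
# is the original curve moved four bins to the left.
print(np.allclose(cdf_shifted[:-4], cdf_original[4:]))
# The mirrored curve does not match the shifted one.
print(np.allclose(cdf_mirrored, cdf_shifted))
```

Plotting cdf_shifted and cdf_mirrored against bins[1:] shows the first as a displaced copy of the original and the second as its reflection, which matches the "inverted" look described in the question.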
    


    Note that NumPy might make your life easier (and, for a large original_array, a lot faster); for example, if your data is a NumPy array, you can also write it as:

    modified_array = original_array + offset
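
On the other part of the question, shifting the bins without rebuilding the histogram from a new array: one possible approach is to compute the counts once with np.histogram and then add the offset to the bin edges. This is a sketch under the assumption that only the x positions need to move; the names shifted_edges, counts_check and cdf_check are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)        # seeded so the sketch is repeatable
original_array = rng.normal(size=500)
bins = np.linspace(-5, 5, 11)
offset = -2                           # same offset as in the example above

# Histogram the data once and build the cumulative fraction per bin.
counts, edges = np.histogram(original_array, bins=bins)
cdf = np.cumsum(counts) / counts.sum()

# Shifting the curve only requires moving the edges; the counts are unchanged.
shifted_edges = edges + offset

# Cross-check: recomputing from shifted data with shifted edges agrees.
counts_check, _ = np.histogram(original_array + offset, bins=shifted_edges)
cdf_check = np.cumsum(counts_check) / counts_check.sum()
print(np.array_equal(counts, counts_check))
```

The shifted curve could then be drawn with, for example, plt.step(shifted_edges[1:], cdf), instead of calling plt.hist a second time.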