python, data-analysis, quantile, binning

Binning data in Python


I'm working very hard to understand how to bin data in Python. So far I have worked out how to get the edges using:

edges = pylab.hist(data, bins=10)[1]

I'm not sure if this is the ideal method, but it worked! It gives me the list of 11 numbers needed to make 10 bins. The problem is that I'm at a loss as to how to then classify the data into those bins. I tried using:

digitized = np.digitize(data, edges)

But that just gave me an error, "ValueError: zero-size array to reduction operation minimum which has no identity". I need to make bins somehow before using pandas value_counts (I have that part down already as well).

Any help would be super appreciated!


Solution

  • The answer is:

    digitized = np.digitize(data, edges)
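
For context, here is a minimal end-to-end sketch of how that line might fit with the edge computation and the pandas value_counts step mentioned in the question. The sample `data` array is made up for illustration (the question does not show its data), and `np.histogram_bin_edges` is swapped in for `pylab.hist` only so that the edges can be computed without drawing a plot; treat it as one possible arrangement, not the asker's exact setup.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original post does not show its `data`.
data = np.array([0.0, 1.0, 2.5, 4.0, 5.0, 7.5, 9.0, 10.0])

# 11 edges for 10 equal-width bins spanning min(data)..max(data).
edges = np.histogram_bin_edges(data, bins=10)

# Bin index per value: i means edges[i-1] <= x < edges[i] (default right=False).
digitized = np.digitize(data, edges)

# The maximum equals the last edge, so it gets index 11 rather than 10.
# Clipping folds it into the last bin (an illustrative choice, not a requirement).
digitized = np.clip(digitized, 1, len(edges) - 1)

# Count how many values landed in each bin, ordered by bin index.
counts = pd.Series(digitized).value_counts().sort_index()
print(counts)
```

Note that the ValueError quoted in the question ("zero-size array to reduction operation minimum which has no identity") is the message NumPy raises when a reduction such as min is applied to an empty array, so it may be worth checking that `data` is not empty (for example with `len(data)`) before computing the edges.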