python pandas numpy matplotlib binning

How do I count the number of points in each bin?


I have a pandas DataFrame with x, y coordinates and want to count the number of points in each bin. I know this can be visualised with plt.hist2d(), but I want an array/matrix that holds the counts per bin.

I've binned my x, y coordinates using: bins = (df // .1 * .1).round(1).stack().groupby(level=0).apply(tuple) where df is:

     x         y
-2.319059 -4.057801
1.514416 -2.325972
-2.642251 -1.004367
-1.486476 -2.535654
-0.844162 -3.078726
-2.376592 -1.471239
-3.139233  0.449457
:
etc

and bins is:

0       (-2.4, -4.1)
1        (1.5, -2.4)
3       (-2.7, -1.1)
4       (-1.5, -2.6)
6       (-0.9, -3.1)
7       (-2.4, -1.5)
8        (-3.2, 0.4)
:
etc

I tried to make an empty numpy array using:

x_size = int(max(list(df['x'])))
y_size = int(max(list(df['y'])))
my_array = np.zeros((x_size+1,y_size+1), np.int16)

but I'm not sure how to relate the bin coordinates to the array coordinates in order to count them.


Solution

  • Simply group by your bins and use the GroupBy.count method. The result is a Series with one count per occupied bin:

    bins.groupby(bins).count()
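
The following is a minimal sketch of the approach above, followed by one possible way to turn the counts into the array/matrix the question asks for. It assumes a bin width of 0.1 (the name `step` is introduced here for illustration), reuses a few rows from the question, and adds one hypothetical extra row so that a bin holds more than one point. The matrix step is not part of the original answer: it shifts the integer bin indices so the smallest bin maps to index 0, then accumulates with np.add.at.

```python
import numpy as np
import pandas as pd

# Rows copied from the question, plus one hypothetical row (the last one)
# that falls into the same bin as the first row.
df = pd.DataFrame({
    "x": [-2.319059, 1.514416, -2.642251, -1.486476,
          -0.844162, -2.376592, -3.139233, -2.35],
    "y": [-4.057801, -2.325972, -1.004367, -2.535654,
          -3.078726, -1.471239, 0.449457, -4.02],
})

# Same binning as in the question: floor to a 0.1 grid, pair x with y.
bins = (df // .1 * .1).round(1).stack().groupby(level=0).apply(tuple)

# Count per occupied bin, as in the answer.
counts = bins.groupby(bins).count()

# Assumed extension: build a dense count matrix.
step = 0.1  # assumed bin width, matching the 0.1 grid above

# Integer bin index of each point along each axis.
ix = (df["x"] // step).astype(int).to_numpy()
iy = (df["y"] // step).astype(int).to_numpy()

# Shift so that the smallest bin index becomes array index 0.
ix = ix - ix.min()
iy = iy - iy.min()

# Accumulate one count per point; np.add.at handles repeated indices.
matrix = np.zeros((ix.max() + 1, iy.max() + 1), dtype=int)
np.add.at(matrix, (ix, iy), 1)
```

Here matrix[i, j] holds the number of points whose x falls in the i-th bin and whose y falls in the j-th bin, counted from the smallest occupied bin on each axis. If only the matrix is needed, np.histogram2d with explicit bin edges is another option.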