I have a pandas DataFrame with x, y coordinates and want to count the number of points in each bin. I know this can be visualised using plt.hist2d(),
but I want some sort of array/matrix that holds the counts per bin.
I've binned my x, y coordinates using:
bins = (df // .1 * .1).round(1).stack().groupby(level=0).apply(tuple)
where df is:
x y
-2.319059 -4.057801
1.514416 -2.325972
-2.642251 -1.004367
-1.486476 -2.535654
-0.844162 -3.078726
-2.376592 -1.471239
-3.139233 0.449457
:
etc
and bins is:
0 (-2.4, -4.1)
1 (1.5, -2.4)
3 (-2.7, -1.1)
4 (-1.5, -2.6)
6 (-0.9, -3.1)
7 (-2.4, -1.5)
8 (-3.2, 0.4)
:
etc
I tried to make an empty numpy array using:
x_size = int(max(list(df['x'])))
y_size = int(max(list(df['y'])))
my_array = np.zeros((x_size+1,y_size+1), np.int16)
but I'm not sure how to relate the bin coordinates to the array coordinates in order to count them.
Simply group by your bins and use the GroupBy.count method:
bins.groupby(bins).count()
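If you want the counts as a 2D array rather than a Series keyed by tuples, one possible approach is to turn each coordinate into an integer bin index and shift it so the smallest index becomes 0. The following is only a minimal sketch under a few assumptions: the bin width is 0.1 (matching the `// .1` in the question), the sample rows are the ones shown in the question, and the names `width`, `ix`, `iy`, `x0`, `y0` and `counts` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "x": [-2.319059, 1.514416, -2.642251, -1.486476, -0.844162, -2.376592, -3.139233],
    "y": [-4.057801, -2.325972, -1.004367, -2.535654, -3.078726, -1.471239, 0.449457],
})

width = 0.1  # assumed bin width, matching the `// .1` in the question

# Integer bin index of every point: floor(value / width).
ix = np.floor(df["x"].to_numpy() / width).astype(int)
iy = np.floor(df["y"].to_numpy() / width).astype(int)

# Offsets that map the smallest bin index to array index 0,
# so negative coordinates are handled as well.
x0, y0 = ix.min(), iy.min()

# One cell per bin between the smallest and largest occupied bin.
counts = np.zeros((ix.max() - x0 + 1, iy.max() - y0 + 1), dtype=int)

# np.add.at accumulates correctly even when several points share a bin.
np.add.at(counts, (ix - x0, iy - y0), 1)
```

Here `counts[i, j]` holds the number of points in the bin whose lower corner is roughly `((i + x0) * width, (j + y0) * width)`. If you do not need the bins aligned to multiples of 0.1, `np.histogram2d` also returns the count matrix together with the bin edges.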