I have a list of z
points associated to pairs x,y
, meaning that for example
x y z
3.1 5.2 1.3
4.2 2.3 9.3
5.6 9.8 3.5
and so on. The total number of z
values is relatively high, around 10000.
I would like to bin my data, in the following sense:
1) I would like to split the x
and y
values into cells, so as to make a 2-dimensional grid in x,y
.If I have Nx
cells for the x
axis and Ny
for the y
axis, I would then have Nx*Ny
cells on the grid. For example, the first bin for x
could be ranging from 1. to 2., the second from 2. to 3. and so on.
2) For each of this cell in the 2dimensional grid, I would then need to calculate how many points fall into that cell, and sum all their z
values. This gives me a numerical value associated to each cell.
I thought about using binned_statistic
from scipy.stats
, but I would have no idea on how to set the options to accomplish my task. Any suggestions? Also other tools, other than binned_statistic
, are well accepted.
Assuming I understand, you can get what you need by exploiting the expand_binnumbers parameter for binned_statistic_2d, thus.
from scipy.stats import binned_statistic_2d
import numpy as np
x = [0.1, 0.1, 0.1, 0.6]
y = [2.1, 2.6, 2.1, 2.1]
z = [2.,3.,5.,7.]
binx = [0.0, 0.5, 1.0]
biny = [2.0, 2.5, 3.0]
ret = binned_statistic_2d(x, y, None, 'count', bins=[binx,biny], \
expand_binnumbers=True)
print (ret.statistic)
print (ret.binnumber)
sums = np.zeros([-1+len(binx), -1+len(biny)])
for i in range(len(x)):
m = ret.binnumber [0][i] - 1
n = ret.binnumber [1][i] - 1
sums[m][n] += sums[m][n] + z[i]
print (sums)
This is just an expansion of one of the examples. Here's the output.
[[ 2. 1.]
[ 1. 0.]]
[[1 1 1 2]
[1 2 1 1]]
[[ 9. 3.]
[ 7. 0.]]