I'm trying to create a 3D histogram of some data I have but I think there must be something wrong with how I'm defining the bins because I'm getting a lot of empty marginal histograms. The following is how I'm creating the histogram and how I'm checking the values --
# Hard boundaries for the cube
param1_range = [0, 6]
param2_range = [-2, +0.5]
param3_range = [0, 2]
d_param1 = 0.5
param1_bins = np.arange(param1_range[0], param1_range[1], d_param1)
N_param1_bins = len(d_param1) - 1
d_param2 = 0.5
param2_bins = np.arange(param2_range[0], param2_range[1], d_param2)
N_param2_bins = len(d_param2) - 1
d_param3 = 0.25
param3_bins = np.arange(param3_range[0], param3_range[1], d_param3)
N_param3_bins = len(d_param3) - 1
empty_param1, empty_param2, empty_param3 = 0, 0, 0
cube_full = np.zeros((Nobs, N_param1_bins, N_param2_bins, N_param3_bins))
for i in range(Nobs):
tmp = np.vstack((param1_data[i,:], param2_data[i,:], param3_data[i,:])).T
hist, _ = np.histogramdd(tmp[:,:],
bins=[param1_bins, param2_bins, param3_bins],
density=True)
cube_full[i,:,:,:] = hist
if np.all(hist[:,0,0] == 0.):
empty_param1 = empty_param1 + 1
if np.all(hist[0,:,0] == 0.):
empty_param2 = empty_param2 + 1
if np.all(hist[0,0,:] == 0.):
empty_param3 = empty_param3 + 1
print(empty_param1/Nobs, empty_param2/Nobs, empty_param3/Nobs)
About half of the marginal histograms for "param1" are empty and nearly all for "param2" and "param3".
I determined the parameter ranges from looking at my data. I also created 1D histograms for the different parameters using these same ranges and don't get empty bins like when I try to make the 3D histogram. Is there some additional consideration that I'm missing when defining 3D bins versus 1D bins?
I'm interpreting here, but from the comments, I THINK what you want is this:
sum1 = np.sum(hist, (1,2))
if np.all(sum1) == 0:
empty_param1 += 1
sum2 = np.sum(hist, (0,2))
sum3 = np.sum(hist, (0,1))
This essentially produces the 1D histograms from the 3D data. If I've still missed the mark, maybe you can provide a example using smaller matrices.
Example:
>>> x = np.arange(125).reshape((5,5,5))
>>> x
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[ 10, 11, 12, 13, 14],
[ 15, 16, 17, 18, 19],
[ 20, 21, 22, 23, 24]],
[[ 25, 26, 27, 28, 29],
[ 30, 31, 32, 33, 34],
[ 35, 36, 37, 38, 39],
[ 40, 41, 42, 43, 44],
[ 45, 46, 47, 48, 49]],
[[ 50, 51, 52, 53, 54],
[ 55, 56, 57, 58, 59],
[ 60, 61, 62, 63, 64],
[ 65, 66, 67, 68, 69],
[ 70, 71, 72, 73, 74]],
[[ 75, 76, 77, 78, 79],
[ 80, 81, 82, 83, 84],
[ 85, 86, 87, 88, 89],
[ 90, 91, 92, 93, 94],
[ 95, 96, 97, 98, 99]],
[[100, 101, 102, 103, 104],
[105, 106, 107, 108, 109],
[110, 111, 112, 113, 114],
[115, 116, 117, 118, 119],
[120, 121, 122, 123, 124]]])
>>> np.sum(x,(1,2))
array([ 300, 925, 1550, 2175, 2800])
>>> np.sum(x,(0,2))
array([1300, 1425, 1550, 1675, 1800])
>>> np.sum(x,(0,1))
array([1500, 1525, 1550, 1575, 1600])
>>>