Tags: python, arrays, numpy, histogram, histogram2d

numpy digitize for 2 dimensional histogram


I am looking for an equivalent of np.digitize for a two-dimensional histogram.

From these lines of code:

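    import numpy as np

    # coord: array whose first two columns hold the x and y values
    H, xedges, yedges = np.histogram2d(coord[:, 0], coord[:, 1])
    H = H.T  # transpose so that rows correspond to y and columns to x
    print(H)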

I obtain the following histogram:

    [[ 7. 20. 16. 14. 10.  8. 16.  7. 10.  7.]
     [11. 11. 10. 10.  5. 10.  9. 12.  7.  7.]
     [13. 11. 13.  9. 13. 10. 14.  6.  9.  9.]
     [ 5.  5.  4.  5.  7. 13. 14. 11.  6. 10.]
     [14.  4. 11.  5.  7. 14.  6. 11. 11.  5.]
     [12.  9.  5.  7.  9. 14. 15. 15. 13. 12.]
     [ 5. 13. 15.  9. 10.  7. 10. 12.  7.  5.]
     [ 4. 10. 15.  7.  6. 10. 13.  5. 12. 12.]
     [12.  6. 11.  8.  5.  5. 13. 14. 13.  9.]
     [10. 11.  9.  8. 18. 13. 16.  8.  8. 13.]]

and I want to find the indices of the data points that each element of the histogram counts, e.g. the element in the 1st row, 1st column is 7, so 7 indices should be returned for it. I spent many hours trying to follow this post, but I get stuck at one point (I can explain where, if no better approach is known). Another potential duplicate is here, but it does not solve the problem either.

Does anybody know how to resolve this?

Thank you!


Solution

  • You want scipy.stats.binned_statistic_2d:

    import scipy.stats

    H, xedges, yedges, binnumber = scipy.stats.binned_statistic_2d(
        coord[:, 0], coord[:, 1], None, 'count', expand_binnumbers=True)

    The first three return values are the same as in your original code. The fourth (binnumber) is:

    a shape (2,N) ndarray, where each row gives the bin numbers in the corresponding dimension.
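
To illustrate how per-dimension bin numbers map back to point indices, here is a minimal NumPy-only sketch. It uses np.digitize on the edges returned by np.histogram2d instead of the scipy call, so it is an alternative route and not the method of the answer. The random `coord` array and the `points_in_bin` helper are assumptions made up for the example.

```python
import numpy as np

# Hypothetical stand-in for the question's `coord` array.
rng = np.random.default_rng(0)
coord = rng.random((1000, 2))

H, xedges, yedges = np.histogram2d(coord[:, 0], coord[:, 1])

# Digitizing against the inner edges only gives 0-based bin indices and
# places values equal to the outermost edges in the first or last bin.
ix = np.digitize(coord[:, 0], xedges[1:-1])
iy = np.digitize(coord[:, 1], yedges[1:-1])


def points_in_bin(i, j):
    """Return the row indices of coord that are counted in H[i, j]."""
    return np.flatnonzero((ix == i) & (iy == j))


idx = points_in_bin(0, 0)
print(len(idx), H[0, 0])  # both numbers should agree
print(coord[idx])         # the points that fall into that bin
```

With scipy's binnumber the same lookup should work after `ix, iy = binnumber - 1`, since those bin numbers are 1-based. Note that `H[i, j]` here is the untransposed histogram; after `H = H.T`, row r and column c correspond to `iy == r` and `ix == c`.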