python, numpy, matplotlib, histogram2d

Please explain in detail 2D Histogram in Python


I am trying to understand what the values in a 2D histogram represent.

I have two NumPy arrays of the same length, X and Y, each containing floats.

For example, the first few values of X: [88, 193, 60, 98, 78, 100, 75, 76, 130]

and Y: [ 18. , 9. , 36.1, 18.5, 34.3, 32.9, 32.2, 22. , 15. ]

When I use:

import matplotlib.pyplot as plt

plt.hist2d(X,Y, bins=(10,20)) 

I get a 2D histogram.

But what does it mean?

A 1D histogram simply shows me how many of each item I have.

Please explain what it means in 2D.

Thanks in advance!


Solution

  • Suppose you have a 1D array and you plot the position of each value along the x axis. If the points are so dense that you can't see how they are distributed, you split the axis into bins and count the values that fall into each bin. That is a 1D histogram.

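    Now suppose you have two 1D arrays that together give a list of 2D points (x, y). You plot their positions on the x-y plane, and again they are so dense that they overlap. To see the distribution, you split the plane into rectangular bins and count the points that fall into each one. That is a 2D histogram: with bins=(10, 20), the x range is divided into 10 intervals and the y range into 20, and the color of each rectangle shows how many (x, y) pairs landed in it.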
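To make the counting concrete, here is a minimal sketch that bins the nine sample values from the question into a 2 x 2 grid with np.histogram2d. The 2 x 2 grid is an assumption chosen for readability (the question uses bins=(10, 20)), and the variable names are illustrative.

```python
import numpy as np

# Sample values copied from the question
X = np.array([88, 193, 60, 98, 78, 100, 75, 76, 130], dtype=float)
Y = np.array([18.0, 9.0, 36.1, 18.5, 34.3, 32.9, 32.2, 22.0, 15.0])

# 2 bins along x and 2 bins along y (a coarse grid, chosen for readability)
counts, xedges, yedges = np.histogram2d(X, Y, bins=(2, 2))

print(xedges)   # x bin boundaries
print(yedges)   # y bin boundaries
print(counts)   # counts[i, j] = number of points in x bin i and y bin j

# Recount one cell by hand: low-x bin (i = 0), high-y bin (j = 1).
# The last bin along each axis includes its upper edge.
in_cell = (X >= xedges[0]) & (X < xedges[1]) & (Y >= yedges[1]) & (Y <= yedges[2])
manual_count = int(in_cell.sum())
print(manual_count)
```

Here counts comes out as [[3, 4], [2, 0]]: three points have low x and low y, four have low x and high y, two have high x and low y, and none are high in both. plt.hist2d draws this kind of grid, with color standing in for the number in each cell.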

    Here is an example

    import numpy as np
    import matplotlib.pyplot as plt
    
    # In a Jupyter notebook, uncomment the next line to display plots inline
    # %matplotlib inline
    
    # prepare 2D random dots centered at (0, 0)
    n = 100000
    x = np.random.randn(n)
    y = x + np.random.randn(n)
    
    # plot data
    fig1 = plt.figure()
    plt.plot(x,y,'.r')
    plt.xlabel('x')
    plt.ylabel('y')
    

    gives

    [Figure: scatter plot of y against x, with the points drawn as red dots]

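    # plot 2D histogram: each colored cell shows how many points fall in that bin
    fig2 = plt.figure()
    plt.hist2d(x, y, bins=100)
    plt.xlabel('x')
    plt.ylabel('y')
    cbar = plt.colorbar()
    cbar.ax.set_ylabel('Counts')

    # needed to display the figures when running as a plain script
    plt.show()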
    

    gives

    [Figure: 2D histogram of x and y with a colorbar labeled "Counts"]
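The numbers behind the colors can also be inspected directly. plt.hist2d returns the count array and the bin edges along with the image, and np.histogram2d computes the same counts without drawing anything. The sketch below uses np.histogram2d on data generated like the example above. The seed, the sample size and bins=(10, 20) are assumptions for illustration. It also checks that adding up the 2D counts across all y bins gives back the ordinary 1D histogram of x.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
n = 100_000
x = rng.standard_normal(n)
y = x + rng.standard_normal(n)

# 10 bins along x, 20 bins along y, as in the question
counts, xedges, yedges = np.histogram2d(x, y, bins=(10, 20))

print(counts.shape)        # one row per x bin, one column per y bin
print(int(counts.sum()))   # every point is counted in exactly one cell

# Summing over the y bins collapses the grid to a 1D histogram of x
x_counts_from_2d = counts.sum(axis=1)
x_counts_1d, _ = np.histogram(x, bins=xedges)
print(np.array_equal(x_counts_from_2d, x_counts_1d))
```

In that sense a 2D histogram answers the same question as a 1D one ("how many values fall in this bin?"), except that a bin is now a rectangle in the x-y plane rather than an interval on one axis.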