python pandas dataframe plot scatter

Scatter plot multiple columns from dataframe python


  • I have a dataframe (97 columns x 30 rows) that contains only the values 1 and 0.
  • I want to plot it like a scatter plot, with the column names on the x axis and the index names on the y axis.

[my dataframe is like this][1]

  • The output I want is similar to the photo, but a red dot must appear only where the cell at the intersection of a row and a column has the value 1.
  • If the value is 0, nothing is plotted at that intersection. [][2] [the output scatter plot I want][3]
  1. https://i.sstatic.net/hFnQX.png
  2. https://i.sstatic.net/Rsguk.jpg
  3. https://i.sstatic.net/keGC6.png

Solution

  • A straightforward way to do this is to use two nested loops for plotting the points conditionally on each dataframe cell:

    import pandas as pd
    import matplotlib.pyplot as plt
    
    example = pd.DataFrame({'column 1': [0, 1, 0, 1], 
                            'column 2': [1, 0, 1, 0],
                            'column 3': [1, 1, 0, 0]})
    
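    # x and y are the integer positions; col and ind are the labels
    for x, col in enumerate(example.columns):
        for y, ind in enumerate(example.index):
            # Draw a red dot only where the cell holds a 1 (truthy value)
            if example.loc[ind, col]:
                plt.plot(x, y, 'o', color='red')

    # Replace the integer tick positions with the column and index names
    plt.xticks(range(len(example.columns)), labels=example.columns)
    plt.yticks(range(len(example)), labels=example.index)

    plt.show()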
    

(Image: example plot)
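
  • For a larger dataframe such as the 97 x 30 one in the question, the nested loops can be replaced by a single scatter call. The following is only a sketch of that alternative, not part of the original answer: the helper name `plot_ones` is hypothetical, and it assumes the dataframe holds only 0 and 1 so that `np.nonzero` finds exactly the cells to mark.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_ones(df, ax=None):
    """Draw a red dot at every (column, row) position where df is non-zero.

    Hypothetical helper for illustration; assumes df contains only 0 and 1.
    """
    if ax is None:
        _, ax = plt.subplots()
    # np.nonzero returns the row and column positions of all non-zero cells
    rows, cols = np.nonzero(df.to_numpy())
    ax.scatter(cols, rows, color='red')
    # Label the integer positions with the column and index names;
    # rotating the x labels keeps many column names readable
    ax.set_xticks(range(df.shape[1]))
    ax.set_xticklabels(df.columns, rotation=90)
    ax.set_yticks(range(df.shape[0]))
    ax.set_yticklabels(df.index)
    return ax


example = pd.DataFrame({'column 1': [0, 1, 0, 1],
                        'column 2': [1, 0, 1, 0],
                        'column 3': [1, 1, 0, 0]})
ax = plot_ones(example)
```

    Call `plt.show()` afterwards to display the figure. If the first row should appear at the top, as it does when the dataframe is printed, `ax.invert_yaxis()` flips the y axis.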