Tags: python, pandas, matplotlib, seaborn, scatter-plot

Plotting scatter plot of pandas dataframe with both categorical and numerical data


I am trying to plot a scatter plot of the following type of pandas dataframe:

df = pd.DataFrame([['RH1', 1, 3], ['RH2', 0, 3], ['RH3', 2, 0], ['RH4', 1, 2]], columns=['name', 'A', 'B'])

The final plot should have the "name" column on the Y axis and "A" and "B" on the X axis, with each distinct numerical value drawn in a different colour.

I tried to plot it by looping over each row of the dataframe, but I got stuck. The main problem I ran into is the size of both axes. It would be great if anyone could help. Thank you in advance.


Solution

  • You can melt your dataframe and use the values as the column for color:

    from matplotlib import pyplot as plt
    import pandas as pd
    
    df = pd.DataFrame([['RH1', 1, 3], ['RH2', 0, 3], ['RH3', 2, 0], ['RH4', 1, 2]], columns=['name', 'A', 'B'])
    
    df.melt(["name"]).plot(x="variable", y="name", kind="scatter", c="value", cmap="plasma")
    plt.show()
    


    If you have a limited number of values, you can change the colormap to a discrete colormap and label each color with its value. Alternatively, use seaborn's stripplot:

    from matplotlib import pyplot as plt
    import pandas as pd
    import seaborn as sns
    
    df = pd.DataFrame([['RH1', 1, 3], ['RH2', 0, 3], ['RH3', 2, 0], ['RH4', 1, 2]], columns=['name', 'A', 'B'])
    
    sns.stripplot(data=df.melt(["name"]), x="variable", y="name", hue="value", jitter=False)
    plt.show()
    

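The answer mentions the discrete-colormap option but does not show code for it. A minimal sketch, assuming the values are few and numeric, might look like the following. The names `long_df`, `levels` and `codes` are illustrative, and building the colormap from evenly spaced samples of "plasma" is just one way to do it.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import BoundaryNorm, ListedColormap

df = pd.DataFrame(
    [["RH1", 1, 3], ["RH2", 0, 3], ["RH3", 2, 0], ["RH4", 1, 2]],
    columns=["name", "A", "B"],
)
long_df = df.melt(["name"])  # long format: columns name, variable, value

# Assumption: only a handful of distinct values, so one colour per value is readable.
levels = sorted(long_df["value"].unique())

# Sample one colour per level from a continuous colormap.
cmap = ListedColormap(plt.get_cmap("plasma")(np.linspace(0, 1, len(levels))))

# Map each value to its position in `levels`, so the colour bins stay evenly
# spaced even when the values themselves are not consecutive.
codes = long_df["value"].map({v: i for i, v in enumerate(levels)})
norm = BoundaryNorm(np.arange(-0.5, len(levels)), cmap.N)

fig, ax = plt.subplots()
points = ax.scatter(long_df["variable"], long_df["name"], c=codes, cmap=cmap, norm=norm)

# One colorbar tick per level, labelled with the original value.
cbar = fig.colorbar(points, ax=ax, ticks=list(range(len(levels))))
cbar.set_ticklabels([str(v) for v in levels])
cbar.set_label("value")
```

Call `plt.show()` afterwards to display the figure. Each colorbar segment then corresponds to exactly one value rather than to a continuous range.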