python matplotlib marker scatter

How can I make a matplotlib scatter plot with a categorical x-axis that lets me set the marker and color based on a third variable?


Here is my dataframe: https://i.sstatic.net/WJQWg.png

I need a matplotlib scatter plot that has the movie title as the label on the x-axis, in the order given by the 'Order' column. I also want the color of the markers to be determined by the genre of the movie. How can I do this with Matplotlib?

Note that I would ideally like to use the object-oriented approach to matplotlib - i.e. using:

fig, ax = plt.subplots()

Solution

  • Since you specify only the x-values and not the y-values, it is hard to know your exact goal, but the following code creates a scatter plot with categorical data. I chose to plot the 'Order' column as the y-values; you can modify the code to plot something else or reorder your data as you like:

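    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # Sample data standing in for the dataframe in the question
    data = {'Order': np.arange(4),
            'Movie Title': ['a', 'b', 'c', 'd'],
            'Genre': ['genre2', 'genre1', 'genre1', 'genre2']}

    # Sort by 'Order' so the categorical x-axis follows that column
    df = pd.DataFrame.from_dict(data).sort_values('Order')

    # One color per point, chosen from the genre
    colors = ['blue' if g == 'genre1' else 'red' for g in df['Genre']]

    fig, ax = plt.subplots()
    ax.scatter(df['Movie Title'], df['Order'], c=colors)
    plt.show()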
    

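
The question also asks about choosing the marker from a third variable, which a single `ax.scatter` call cannot do because `marker` takes one style per call. A possible sketch is to draw one scatter call per genre. Everything below that is not in the question is an assumption for illustration: the `Rating` column used as y-values, the sample rows, the `styles` mapping, and the helper name `scatter_by_category`. Integer x-positions with explicit tick labels are used so the titles stay in 'Order' order even though the points are drawn group by group.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data; 'Rating' is an assumed y-value column.
df = pd.DataFrame({
    'Order': [2, 0, 3, 1],
    'Movie Title': ['c', 'a', 'd', 'b'],
    'Genre': ['genre1', 'genre2', 'genre2', 'genre1'],
    'Rating': [7.1, 8.3, 6.4, 5.9],
})

# Assumed mapping from genre to (color, marker); adjust as needed.
styles = {'genre1': ('blue', 'o'), 'genre2': ('red', '^')}


def scatter_by_category(ax, df, styles):
    """Draw one scatter call per genre so each genre gets its own marker and color."""
    # After sorting, the new 0..n-1 index is the x-position of each movie.
    ordered = df.sort_values('Order').reset_index(drop=True)
    for genre, group in ordered.groupby('Genre'):
        color, marker = styles[genre]
        ax.scatter(group.index, group['Rating'],
                   color=color, marker=marker, label=genre)
    # Label the integer positions with the movie titles, in 'Order' order.
    ax.set_xticks(list(range(len(ordered))))
    ax.set_xticklabels(ordered['Movie Title'])
    ax.legend(title='Genre')
    return ordered


fig, ax = plt.subplots()
ordered = scatter_by_category(ax, df, styles)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. The `label=genre` argument is what feeds the legend, so each genre appears there with its own marker and color.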