python matplotlib scatter-plot

How to create a scatter plot with yticklabels in a custom order


Can I produce a scatter plot from the data below while keeping the y-axis tick labels in this custom order?

y-axis tick/label order → [0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]
            
x = ('a', 'b', 'c', 'd', 'e')

y = [[1,5,2], [10,5,11,7], [9], [], [ 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]]

The idea is to plot the y-values against the x-values. For example, the first entry ('a') hit, say, 1, 5 and 2 on a target, and the second ('b') hit 10, 5, 11 and 7. The y-axis labels must stay in the order shown above.



Solution

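  • The approach is to show the y-axis tick labels in the order given by y_ticks. The ticks themselves stay at the regular positions 0 to 11; only their labels change, so a label no longer matches the coordinate it sits at, and each data value has to be plotted at the position of its label rather than at its own value.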

    import matplotlib.pyplot as plt
    
    # Define the data we'll work with. 
    x_labels = ("a", "b", "c", "d", "e")
    x_ticks = range(len(x_labels))
    y_ticks = (0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5)
    # y values to be plotted
    y_lists = ([1, 5, 2], [10, 5, 11, 7], [9], [], [7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5])
    
    # Define the figure and ax.
    fig, ax = plt.subplots()
    fig.show()
    
    # Disable autoscaling (this call affects both axes) since we set the limits, ticks and ticklabels ourselves.
    ax.autoscale(False)
    
    # We can use a bit of custom padding.
    e = 0.05
    
    xmin, xmax = min(x_ticks), max(x_ticks)
    dx = xmax - xmin
    ax.set_xlim(xmin - dx*e, xmax + dx*e)
    
    ymin, ymax = min(y_ticks), max(y_ticks)
    dy = ymax - ymin
    ax.set_ylim(ymin - dy*e, ymax + dy*e)
    
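    # Put the ticks at the sorted positions and label them in the custom order.
    # This works because y_ticks is a permutation of 0..11, so sorted(y_ticks) is exactly the positions 0..11.
    ax.set_yticks(sorted(y_ticks))
    ax.set_yticklabels(y_ticks)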
    
    # Set the x-ticks.
    ax.set_xticks(x_ticks)
    ax.set_xticklabels(x_labels)  # `ax.set_xticklabels("abcde")` would work too.
    
    for x, y_list in zip(x_ticks, y_lists):
        # We have a single x value for each letter, but the x-list must be as long as the y-list to make a scatter.
        x_list = [x]*len(y_list) 
        # Notice the use of `.index` here to accommodate the ticks not being ordered.
        true_y_list = [y_ticks.index(y) for y in y_list] 
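        ax.scatter(x_list, true_y_list)

    # Needed when running as a script: `fig.show()` above does not block, so the window would close immediately.
    plt.show()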

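The `.index` lookup in the loop above can be tried on its own to see which coordinates the points end up at. The following is a minimal sketch, not part of the original code: the names `position_of` and `to_positions` are hypothetical, and the dict is a swap-in for the repeated `.index` calls (same result, but one lookup table instead of a scan per value).

```python
# Custom label order from the question; a label's position in this tuple
# is the y-coordinate at which that label is drawn.
y_ticks = (0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5)

# Hypothetical lookup table: label -> y-coordinate.
# Gives the same answer as y_ticks.index(label).
position_of = {label: pos for pos, label in enumerate(y_ticks)}


def to_positions(y_list):
    """Translate data values into y-coordinates on the re-labelled axis."""
    return [position_of[y] for y in y_list]


print(to_positions([1, 5, 2]))       # [7, 11, 2]
print(to_positions([10, 5, 11, 7]))  # [10, 11, 5, 1]
```

A value that is not in `y_ticks` raises `KeyError` here (`.index` would raise `ValueError`), which is probably what you want: such a point has no label to sit next to.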

    For a variant that keeps the data in a dict and names the x-ticks after its keys, replace the data-definition block at the top of the code with this:

    # Define the data we'll work with.
    data = {"a": [1, 5, 2],
            "b": [10, 5, 11, 7],
            "c": [9],
            "d": [],
            "e": [7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]}
    
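    # Materialise the dict views as lists so they behave like the tuples in the first version.
    x_labels = list(data.keys())
    x_ticks = range(len(x_labels))
    y_ticks = (0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5)
    y_lists = list(data.values())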
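If one colour for all points is acceptable, the dict can also be flattened into two parallel coordinate lists and drawn with a single `scatter` call. This is only a sketch under that assumption; the helper name `scatter_coords` is hypothetical, and it relies on dicts preserving insertion order (Python 3.7+).

```python
def scatter_coords(data, y_ticks):
    """Return parallel x and y coordinate lists for one scatter call."""
    xs, ys = [], []
    for x, y_list in enumerate(data.values()):
        for y in y_list:
            xs.append(x)
            # Plot at the position of the label, not at the value itself.
            ys.append(y_ticks.index(y))
    return xs, ys


data = {"a": [1, 5, 2],
        "b": [10, 5, 11, 7],
        "c": [9],
        "d": [],
        "e": [7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]}
y_ticks = (0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5)

xs, ys = scatter_coords(data, y_ticks)
print(len(xs), len(ys))  # 19 19
```

With the axes set up as in the code above, `ax.scatter(xs, ys)` would then replace the loop. The per-letter loop is still the better choice if each letter should get its own colour, since every `scatter` call picks the next colour in the cycle.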