python, r, matplotlib, label, ggrepel

pylab: plotting points with colors and labels (IDs, not categories)


I'm trying to plot points with both colors and labels. This is not the usual case: Python users typically use "labels" to mean categories. Here I want the color to represent a feature, while the label is an identifier for the point itself. Here is a toy example:

x = [-0.01611772,  1.51755901, -0.64869352, -1.80850313, -0.11505037]
y = [ 0.04845168, -0.45576903,  0.62703651, -0.24415787, -0.41307092]

colors = ['b', 'g', 'r', 'b', 'r']
labels = ['Gioele', 'Felix', 'Elpi', 'Roro', 'Cacara']

I'd like to use the function scatter. Following the "quick" documentation:

def scatter(x, y, s=20, c=None, marker='o', cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, edgecolors=None, hold=None, data=None, **kwargs)

Inferred type: (x: Any, y: Any, s: int, c: Any, marker: unicode, cmap: Any, norm: Any, vmin: Any, vmax: Any, alpha: Any, linewidths: Any, verts: Any, edgecolors: Any, hold: Any, data: Any, kwargs: dict) -> Any

So my attempt was:

import pylab
pylab.scatter(x, y, c=colors, data=labels)
pylab.show()

but it seems to ignore the data=labels part.

In addition: supposing we can plot the labels, is there a way to place them in a "smart" way, i.e. such that the labels don't hide each other? I would need something similar to the R package ggrepel.


Solution

  • I think using plt.annotate is an option here. To take your example:

    import matplotlib.pyplot as plt
    
    x = [-0.01611772,  1.51755901, -0.64869352, -1.80850313, -0.11505037]
    y = [ 0.04845168, -0.45576903,  0.62703651, -0.24415787, -0.41307092]
    colors = ['b', 'g', 'r', 'b', 'r']
    labels = ['Gioele', 'Felix', 'Elpi', 'Roro', 'Cacara']
    
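    plt.scatter(x, y, c=colors)
    # Label each point with its ID, offset 5 points up and right of the marker
    for label, xi, yi in zip(labels, x, y):
        plt.annotate(label, xy=(xi, yi), xytext=(5, 5),
                     textcoords='offset points', ha='left', va='bottom')
    plt.show()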
    

    This produces a scatter plot in which each point has its given color and its label drawn next to it.
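
As for why data=labels appeared to be ignored: in matplotlib, data is not a place to attach labels. It takes an indexable object (a dict, a pandas DataFrame, ...) so that x, y, c and similar arguments can be given as string keys into it. A minimal sketch of that usage follows; the dict name points, its keys, and the Agg backend line (there only so the snippet runs without a display) are my own choices, not something from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line and call plt.show() interactively
import matplotlib.pyplot as plt

# Hypothetical container: the keys are arbitrary names chosen for this sketch.
points = {
    "x": [-0.01611772, 1.51755901, -0.64869352, -1.80850313, -0.11505037],
    "y": [0.04845168, -0.45576903, 0.62703651, -0.24415787, -0.41307092],
    "color": ["b", "g", "r", "b", "r"],
    "name": ["Gioele", "Felix", "Elpi", "Roro", "Cacara"],
}

fig, ax = plt.subplots()

# With data=..., the strings "x", "y" and "color" are looked up as keys in points.
sc = ax.scatter("x", "y", c="color", data=points)

# scatter has no per-point text argument, so the IDs are still drawn with annotate.
annotations = [
    ax.annotate(name, xy=(xi, yi), xytext=(5, 5),
                textcoords="offset points", ha="left", va="bottom")
    for name, xi, yi in zip(points["name"], points["x"], points["y"])
]

fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() to see the result
```

Passing a plain list such as labels as data therefore gives scatter nothing to look up, which is consistent with it having no visible effect.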

    Edit: I just spotted that you also asked about overlapping labels. This question seems to have a good solution. There is also apparently a piece of code on GitHub that is designed to emulate ggrepel.
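
If you only need something rough, overlap avoidance can also be sketched by hand. The snippet below is not ggrepel's algorithm and not the code referred to above: it is a naive greedy pass that pushes a label upward until its bounding box no longer overlaps any label placed before it. The function name nudge_overlapping_labels and its step and max_iter parameters are hypothetical, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt


def nudge_overlapping_labels(fig, annotations, step=4.0, max_iter=100):
    """Greedy sketch: raise each label (offset in points) until it clears earlier ones."""
    fig.canvas.draw()  # draw once so axis limits and text extents are known
    placed = []  # bounding boxes (display coordinates) of labels already settled
    for ann in annotations:
        box = ann.get_window_extent()
        for _ in range(max_iter):
            if not any(box.overlaps(other) for other in placed):
                break
            dx, dy = ann.xyann  # current text offset, as passed via xytext
            ann.xyann = (dx, dy + step)
            box = ann.get_window_extent()
        placed.append(box)
    return annotations


x = [-0.01611772, 1.51755901, -0.64869352, -1.80850313, -0.11505037]
y = [0.04845168, -0.45576903, 0.62703651, -0.24415787, -0.41307092]
colors = ["b", "g", "r", "b", "r"]
labels = ["Gioele", "Felix", "Elpi", "Roro", "Cacara"]

fig, ax = plt.subplots()
ax.scatter(x, y, c=colors)
annotations = [
    ax.annotate(label, xy=(xi, yi), xytext=(5, 5), textcoords="offset points")
    for label, xi, yi in zip(labels, x, y)
]
nudge_overlapping_labels(fig, annotations)
```

This only moves labels vertically and ignores the markers themselves, so for dense plots a dedicated label-placement library is likely the better choice.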