I would like to add a legend to my scatter plot. My current code looks like this:
x=[1,2,3,4]
y=[5,6,7,8]
classes = [2,4,4,2]
plt.scatter(x, y, c=classes, label=classes)
plt.legend()
The problem is that when the plot is created, the legend is shown as an array instead of showing the unique labels and their classes.
I am aware this question has been discussed previously in threads such as this one; however, I feel my problem is simpler, and the solution there does not fit. Also, in that example the person is specifying the colors, whereas in my case I do not know beforehand how many colors I'll need. Moreover, in this example, the user creates multiple scatters, each with a unique color. Again, this is not what I want. My goal is simply to create the plot using the x and y arrays and the labels. Is this possible?
Actually, both linked questions show how to achieve the desired result.
The easiest method is to create as many scatter plots as unique classes exist and give each a single color and legend entry.
import matplotlib.pyplot as plt
x=[1,2,3,4]
y=[5,6,7,8]
classes = [2,4,4,2]
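unique = sorted(set(classes))  # sorted so the legend order is deterministic
colors = [plt.cm.jet(float(i)/max(unique)) for i in unique]
for i, u in enumerate(unique):
    # select the points that belong to class u
    xi = [x[j] for j in range(len(x)) if classes[j] == u]
    yi = [y[j] for j in range(len(x)) if classes[j] == u]
    # color= (not c=) so a single RGBA tuple is never mistaken for data values
    plt.scatter(xi, yi, color=colors[i], label=str(u))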
plt.legend()
plt.show()
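The per-class filtering in the loop above can also be factored into a small helper. This is only a sketch: `group_by_class` and `plot_groups` are hypothetical names (not part of matplotlib), and the colormap is simply carried over from the example above.

```python
from collections import defaultdict


def group_by_class(x, y, classes):
    """Split the points into one (xs, ys) pair per class value (hypothetical helper)."""
    groups = defaultdict(lambda: ([], []))
    for xi, yi, c in zip(x, y, classes):
        groups[c][0].append(xi)
        groups[c][1].append(yi)
    return dict(groups)


def plot_groups(groups):
    """Draw one scatter call per class so each class gets its own legend entry."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    keys = sorted(groups)
    for i, key in enumerate(keys):
        xs, ys = groups[key]
        # Spread the classes evenly over the colormap;
        # max(..., 1) avoids dividing by zero when there is a single class.
        color = plt.cm.jet(i / max(len(keys) - 1, 1))
        plt.scatter(xs, ys, color=color, label=str(key))
    plt.legend()
    plt.show()


groups = group_by_class([1, 2, 3, 4], [5, 6, 7, 8], [2, 4, 4, 2])
print(groups)  # {2: ([1, 4], [5, 8]), 4: ([2, 3], [6, 7])}
```

Calling `plot_groups(groups)` should then produce the same kind of figure as the loop above, with one legend entry per class.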
If the classes are string labels, the solution looks slightly different: the colors have to be derived from each label's index rather than from the class values themselves.
import numpy as np
import matplotlib.pyplot as plt
x=[1,2,3,4]
y=[5,6,7,8]
classes = ['X','Y','Z','X']
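unique = np.unique(classes)  # sorted array of the distinct labels
# map each label's index onto the colormap; max(..., 1) avoids dividing by zero for a single class
colors = [plt.cm.jet(i/float(max(len(unique)-1, 1))) for i in range(len(unique))]
for i, u in enumerate(unique):
    xi = [x[j] for j in range(len(x)) if classes[j] == u]
    yi = [y[j] for j in range(len(x)) if classes[j] == u]
    plt.scatter(xi, yi, color=colors[i], label=str(u))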
plt.legend()
plt.show()
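If a single `scatter` call is preferred, the same label-to-index idea can be combined with the `legend_elements()` method of the collection that `scatter` returns. This is a different technique from the loop above, and the sketch assumes matplotlib 3.1 or newer, where that method is available. `encode_labels` and `scatter_with_legend` are hypothetical helper names.

```python
def encode_labels(classes):
    """Return the sorted unique labels and, for each point, the index of its label."""
    unique = sorted(set(classes))
    index = {label: i for i, label in enumerate(unique)}
    return unique, [index[c] for c in classes]


def scatter_with_legend(x, y, classes):
    """Single scatter call; legend handles come from the returned collection."""
    import matplotlib.pyplot as plt  # assumption: matplotlib >= 3.1

    unique, codes = encode_labels(classes)
    sc = plt.scatter(x, y, c=codes, cmap="jet")
    # num=None asks for one handle per distinct value in `codes`,
    # which are 0..len(unique)-1 and therefore line up with `unique`.
    handles, _ = sc.legend_elements(num=None)
    plt.legend(handles, unique)
    plt.show()


unique, codes = encode_labels(['X', 'Y', 'Z', 'X'])
print(unique, codes)  # ['X', 'Y', 'Z'] [0, 1, 2, 0]
```

Calling `scatter_with_legend([1, 2, 3, 4], [5, 6, 7, 8], ['X', 'Y', 'Z', 'X'])` would then draw all points at once, so the number of colors does not have to be known in advance.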