I'm trying to color-code a scatter plot based on the string in a column. I can't figure out how to set up the legend.
Repeatable Example
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
## Dummy Data
x = [0, 0.03, 0.075, 0.108, 0.16, 0.26, 0.37, 0.49, 0.76, 1.05, 1.64,
     0.015, 0.04, 0.085, 0.11, 0.165, 0.29, 0.37, 0.6, 0.78, 1.1]
y = [16.13, 0.62, 2.15, 41.083, 59.97, 13.30, 7.36, 6.80, 4.97, 3.53, 11.77,
     30.21, 64.47, 57.64, 56.83, 46.69, 4.22, 30.35, 35.12, 5.22, 25.32]
label = ['a', 'a', 'c', 'a', 'c', 'b', 'c', 'c', 'c', 'b', 'c',
         'c', 'c', 'a', 'b', 'a', 'a', 'a', 'b', 'c', 'c', 'c']
df = pd.DataFrame(
    list(zip(x, y, label)),
    columns=['x', 'y', 'label']
)
## Set up colors dictionary
mydict = {'a': 'darkviolet',
          'b': 'darkgoldenrod',
          'c': 'olive'}
## Plotting
plt.scatter(df.x, df.y, c=df['label'].map(mydict))
plt.legend(loc="upper right", frameon=True)
Current Output
Desired Output
Same plot as above, but with a legend entry for each label in its matching color.
Build a list of legend handles, one per color, as shown below. plt.plot
returns a list of lines, so each entry in legendhandle is the first element of that list.
import matplotlib.pyplot as plt
import pandas as pd
## Dummy Data
x = [0, 0.03, 0.075, 0.108, 0.16, 0.26, 0.37, 0.49, 0.76, 1.05, 1.64,
     0.015, 0.04, 0.085, 0.11, 0.165, 0.29, 0.37, 0.6, 0.78, 1.1]
y = [16.13, 0.62, 2.15, 41.083, 59.97, 13.30, 7.36, 6.80, 4.97, 3.53, 11.77,
     30.21, 64.47, 57.64, 56.83, 46.69, 4.22, 30.35, 35.12, 5.22, 25.32]
label = ['a', 'a', 'c', 'a', 'c', 'b', 'c', 'c', 'c', 'b', 'c',
         'c', 'c', 'a', 'b', 'a', 'a', 'a', 'b', 'c', 'c', 'c']
df = pd.DataFrame(
    list(zip(x, y, label)),
    columns=['x', 'y', 'label']
)
## Set up colors dictionary
mydict = {'a': 'darkviolet',
          'b': 'darkgoldenrod',
          'c': 'olive'}
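## Empty line plots act as proxy artists: they draw no data but give the legend one marker per color
legendhandle = [plt.plot([], [], marker="o", ls="", color=color)[0] for color in mydict.values()]
## Plotting
plt.scatter(df.x, df.y, c=df['label'].map(mydict))
## Pair each handle with its label; a dict's keys and values iterate in the same order
plt.legend(legendhandle, list(mydict.keys()), loc="upper right", frameon=True)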
plt.show()
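As a possible alternative to building the handles by hand, here is a minimal sketch that draws one scatter call per label and lets the legend pick up the entries from label=. This is not the approach in the answer above. The helper name scatter_by_label and its ax parameter are hypothetical, introduced only for illustration, and it assumes the data frame has the x, y and label columns from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd


def scatter_by_label(df, colors, ax=None):
    """Hypothetical helper: one scatter call per label, so each group gets a legend entry."""
    if ax is None:
        ax = plt.gca()
    # groupby sorts the group keys by default, so entries appear in sorted label order
    for name, group in df.groupby("label"):
        # label= registers this group with the legend; no manual handles needed
        ax.scatter(group["x"], group["y"], color=colors[name], label=name)
    ax.legend(loc="upper right", frameon=True)
    return ax
```

With the df and mydict from the question, calling scatter_by_label(df, mydict) followed by plt.show() should give the same points with one legend entry per label. The trade-off is several scatter calls instead of one, which is usually fine for a handful of categories.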