According to the pandas documentation, the colormap parameter can be used to select colors from a matplotlib colormap object. However, in a bar chart, the color of each bar has to be selected manually. This is not practical: with a lot of bars, the manual effort is tedious. I expected that, if no color is specified, each object/class would get a unique color. Unfortunately, this is not the case: the colors repeat, and only 10 unique colors are provided.
Code for reproduction:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(100, 25)), columns=list('ABCDEFGHIJKLMNOPQRSTUVWXY'))
df.set_index('A', inplace=True)
df.plot(kind='bar', stacked=True, figsize=(20, 10))
plt.title("some_name")
plt.savefig("some_name" + '.png')
Does somebody have any idea how to get a unique color for each class in the diagram? Thanks in advance
That's probably because the default property cycle (see image below) contains only 10 colors.
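To check this yourself, here is a minimal sketch that reads the built-in defaults. It assumes a Matplotlib 2.0+ installation, where the default cycle is the 10-color "tab10" set:

```python
import matplotlib

# rcParamsDefault holds Matplotlib's built-in default settings
default_cycle = matplotlib.rcParamsDefault["axes.prop_cycle"]
default_colors = default_cycle.by_key()["color"]

print(len(default_colors))   # 10
print(default_colors[:3])    # first few hex codes of the cycle
```

Once a plot has more series than the cycle has entries, the colors start over from the beginning, which is why the 11th column looks like the 1st.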
A workaround would be to build a list of random colors (24 in your case) and pass it as the color keyword argument to pandas.DataFrame.plot:
import random
list_colors = ["#" + "".join([random.choice("0123456789ABCDEF") for _ in range(6)])
               for _ in range(len(df.columns))]
df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors)
Update :
It can be hard to find a palette of 24 very distinct colors. However, you can use one of the palettes available in seaborn:
import seaborn as sns #pip install seaborn
list_colors = sns.color_palette("hsv", n_colors=24)
df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors)
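If you would rather not add seaborn as a dependency, a similar list can probably be built by sampling a matplotlib colormap directly. The sketch below is only an illustration: the choice of "gist_rainbow" and the variable name n_colors are my assumptions, and the Agg backend is set only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same shape as the frame in the question: 24 data columns after set_index
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 25)),
                  columns=list("ABCDEFGHIJKLMNOPQRSTUVWXY"))
df = df.set_index("A")

n_colors = len(df.columns)
cmap = plt.get_cmap("gist_rainbow")  # any continuous colormap should work

# Sample the colormap at n_colors evenly spaced positions in [0, 1)
list_colors = [cmap(i / n_colors) for i in range(n_colors)]

ax = df.plot(kind="bar", stacked=True, figsize=(20, 10), color=list_colors)
```

As far as I know, passing colormap="gist_rainbow" to df.plot instead of color=list_colors gives a comparable result, since pandas then samples the colormap for you. Neighbouring colors will still look similar with 24 classes, so this is a trade-off rather than a perfect fix.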
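Another solution would be to generate random colors and keep only those whose Euclidean distance in RGB space from every color already chosen exceeds a threshold, using scipy.spatial.distance.euclidean from the beautiful scipy: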
from scipy.spatial import distance #pip install scipy
def hex_to_rgb(hex_color):
    return tuple(int(hex_color[i:i+2], 16) for i in (1, 3, 5))

def distinct_colors(n):
    colors = []
    while len(colors) < n:
        color = "#" + "".join(random.choice("0123456789ABCDEF") for _ in range(6))
        if all(distance.euclidean(hex_to_rgb(color), hex_to_rgb(c)) > 50 for c in colors):
            colors.append(color)
    return colors
colors = distinct_colors(len(df.columns)) #len(df.columns)=24
sns.palplot(colors)
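One caveat with the loop above: if n or the distance threshold is too large, it never terminates. A possible variant is sketched below. Note that distinct_colors_bounded, min_dist, max_attempts and seed are names I made up for illustration, and that it swaps scipy's distance.euclidean for the standard-library math.dist (Python 3.8+), which computes the same Euclidean distance.

```python
import math
import random

def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' to an (R, G, B) tuple of ints in 0..255."""
    return tuple(int(hex_color[i:i + 2], 16) for i in (1, 3, 5))

def distinct_colors_bounded(n, min_dist=50, max_attempts=10_000, seed=None):
    """Pick n random hex colors that are pairwise more than min_dist apart in RGB space.

    Hypothetical helper: gives up with ValueError after max_attempts candidates.
    """
    rng = random.Random(seed)  # a fixed seed makes the palette reproducible
    colors = []
    for _ in range(max_attempts):
        if len(colors) == n:
            break
        candidate = "#" + "".join(rng.choice("0123456789ABCDEF") for _ in range(6))
        rgb = hex_to_rgb(candidate)
        # Accept the candidate only if it is far enough from every color kept so far
        if all(math.dist(rgb, hex_to_rgb(c)) > min_dist for c in colors):
            colors.append(candidate)
    if len(colors) < n:
        raise ValueError(f"found only {len(colors)} of {n} colors; lower min_dist or n")
    return colors

colors = distinct_colors_bounded(24, seed=0)
```

The resulting list can be passed to df.plot(..., color=colors) exactly like list_colors in the earlier snippets.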