Tags: python, matplotlib, bar-chart, seaborn, pie-chart

How to set the same colors for the same index labels in different charts in matplotlib and seaborn


I'm trying to plot a pandas data frame in two charts: a pie chart with matplotlib pyplot and a bar chart with seaborn. For each chart I sort the data frame by a different column, the one that chart displays, so the row order differs between the two charts. As a result, the same index (category) appears in a different color in each chart, which is confusing. How can I make each index use the same color in both charts?

My code:

df = pd.DataFrame({"Total":totals,"Infected": infected}, 
                   index=category).sort_values("Total", ascending=False)
fig, ax = plt.subplots(ncols=2, nrows=1,figsize=(20,8))

#creating a pie chart with conditional explode option
threshold = new_train.shape[0]*threshold

if explode==-1:
    ax[0].pie(df[df["Total"]>threshold]["Total"], 
              labels=df[df["Total"]>threshold].index.values, 
              autopct='%1.1f%%',shadow=False, startangle=rotation,  
              textprops={'fontsize': 15})
else:
    ax[0].pie(df[df["Total"]>threshold]["Total"], 
              labels=df[df["Total"]>threshold].index.values, 
              autopct='%1.1f%%',shadow=False, startangle=rotation, 
              textprops={'fontsize': 15}, explode=explode)
ax[0].axis('equal')
ax[0].set_title(col_name)

# create a sorted bar chart
newdf = df[df["Total"]>threshold].sort_values("Infected", ascending=False)
sns.barplot(x=newdf.index, y="Infected", data=newdf,
            order=newdf.index, ax=ax[1])  # , orient='h')
ax[1].set_title('Infected fractions')
# rotate the tick labels after plotting so they match the plotted order
plt.setp(ax[1].get_xticklabels(), rotation=45, horizontalalignment='right')
plt.show()


For example, 1.1.15200.1 is blue in the pie chart but orange in the bar chart.


Solution

  • You will have to make a dictionary that maps each label to its color, and then look the colors up in whatever row order each chart uses. Below is a simplified version that hopefully covers everything you are doing to your data:

    import pandas as pd
    import matplotlib.pyplot as plt
    
    # a dataframe from a list of lists 
    el = [['RED', 50, 1], 
          ['GREEN', 30, 2], 
          ['BLUE', 7, 3], 
          ['YELLOW', 3, 4], 
          ['ORANGE', 9, 5], 
          ['BLACK', 1, 6]]
    df = pd.DataFrame(el)
    
    # since you are not using the entire dataframe
    df_thres = df[df[1] > 1]
    
    # make a fixed length color map manually
    c = ['red', 'green', 'blue', 'yellow', 'orange']
    clist1 = {i:j for i, j in zip(df_thres[0].values, c)}
    
    # make an arbitrary-length colormap
    cm = plt.get_cmap('rainbow')
    c = [cm(1.0 * i/len(df_thres)) for i in range(len(df_thres))]
    clist2 = {i:j for i, j in zip(df_thres[0].values, c)}
    
    
    fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(11, 8))
    
    # manual colors
    
    # PIE PLOT: sort on the third column
    newdf = df_thres.sort_values(2, ascending=False)
    ax[0, 0].pie(newdf[2], labels=newdf[0], colors=[clist1[i] for i in newdf[0].values])
    ax[0, 0].axis('square') # matplotlib 2.x only
    
    # BAR PLOT: sort on the second column    
    newdf = df_thres.sort_values(1, ascending=False)
    ax[0, 1].bar(newdf[0], newdf[1], color=[clist1[i] for i in newdf[0].values])
    
    #----------------
    
    # arbitrary-length generated colorlist
    
    # PIE PLOT: sorted on the third column, wedge sizes from the second column
    newdf = df_thres.sort_values(2, ascending=False)
    ax[1, 0].pie(newdf[1], labels=newdf[0], colors=[clist2[i] for i in newdf[0].values])
    ax[1, 0].axis('square') # matplotlib 2.x only
    
    # BAR PLOT: sort on the second column
    newdf = df_thres.sort_values(1, ascending=False)
    ax[1, 1].bar(newdf[0], newdf[1], color=[clist2[i] for i in newdf[0].values])
    
    plt.show()
    

    This is the output I get. The colors in the lower row do not match the label names, but each label keeps the same color in both charts. (Also, note that I am using matplotlib for both plots here.)

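Since the question draws the bar chart with seaborn, here is a rough sketch of how the same lookup dictionary might be reused there. The data, the label names, and the `tab10` palette are made up for illustration. It assumes the labels live in the index, as in the question, and that seaborn's `barplot` is given the dictionary through `palette` with `hue` set to the label column.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data in the shape of the question's frame (labels are the index).
df = pd.DataFrame(
    {"Total": [50, 30, 7, 3, 9], "Infected": [0.10, 0.40, 0.25, 0.05, 0.30]},
    index=["a", "b", "c", "d", "e"],
)

# Build the label -> color lookup once, before any sorting or filtering.
palette = dict(zip(df.index, sns.color_palette("tab10", n_colors=len(df))))

fig, ax = plt.subplots(ncols=2, figsize=(12, 5))

# Pie chart: sorted by "Total", colors looked up by label.
by_total = df.sort_values("Total", ascending=False)
ax[0].pie(by_total["Total"], labels=by_total.index,
          colors=[palette[label] for label in by_total.index])

# Bar chart: sorted by "Infected", same lookup passed as the seaborn palette.
by_infected = df.sort_values("Infected", ascending=False).reset_index()
by_infected = by_infected.rename(columns={"index": "label"})
sns.barplot(data=by_infected, x="label", y="Infected", hue="label",
            order=list(by_infected["label"]), palette=palette,
            dodge=False,    # one full-width bar per label
            saturation=1,   # barplot desaturates colors by default
            ax=ax[1])

# Some seaborn versions add a legend when hue is set; it is redundant here.
legend = ax[1].get_legend()
if legend is not None:
    legend.remove()
ax[1].tick_params(axis="x", rotation=45)

fig.savefig("same_colors.png")
```

The `saturation=1` argument matters if the two charts should match exactly, because `barplot` otherwise draws slightly desaturated versions of the palette colors. Building the dictionary from the full frame, rather than from a sorted or filtered copy, keeps a label's color stable even if the threshold changes.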