I created a heatmap from a Spearman correlation matrix using seaborn's clustermap, as follows. I want to color the branches of the dendrogram so that it looks like this: dendrogram, but drawn on the heatmap.
I created a dict of colors as follows and got an error:
def assign_tree_colour(name, val_dict, coding_names_df):
    ret = None
    if val_dict.get(name, '') == 'Group 1':
        ret = "(0,0.9,0.4)"  # green
    elif val_dict.get(name, '') == 'Group 2':
        ret = "(0.6,0.1,0)"  # red
    elif val_dict.get(name, '') == 'Group 3':
        ret = "(0.3,0.8,1)"  # light blue
    elif val_dict.get(name, '') == 'Group 4':
        ret = "(0.4,0.1,1)"  # purple
    elif val_dict.get(name, '') == 'Group 5':
        ret = "(1,0.9,0.1)"  # yellow
    elif val_dict.get(name, '') == 'Group 6':
        ret = "(0,0,0)"  # black
    else:
        ret = "(0,0,0)"  # black
    return ret

def fix_string(str):
    return str.replace('"', '')
external_data3 = [list(z) for z in coding_names_df.values]
external_data3 = {fix_string(z[0]): z[3] for z in external_data3}
tree_label = list(df.index)
tree_label = [fix_string(x) for x in tree_label]
tree_labels = { j : tree_label[j] for j in range(0, len(tree_label) ) }
tree_colour = [assign_tree_colour(label, external_data3, coding_names_df) for label in tree_labels]
tree_colors = { i : tree_colour[i] for i in range(0, len(tree_colour) ) }
sns.set(color_codes=True)
sns.set(font_scale=1)
g = sns.clustermap(df, cmap="bwr",
                   vmin=-1, vmax=1,
                   yticklabels=1, xticklabels=1,
                   cbar_kws={"ticks": [-1, -0.5, 0, 0.5, 1]},
                   figsize=(13, 13),
                   row_colors=row_colors,
                   col_colors=col_colors,
                   method='average',
                   metric='correlation',
                   tree_kws=dict(colors=tree_colors))
g.ax_heatmap.set_xlabel('Genus')
g.ax_heatmap.set_ylabel('Genus')
for label in Group.unique():
    g.ax_col_dendrogram.bar(0, 0, color=lut[label],
                            label=label, linewidth=0)
g.ax_col_dendrogram.legend(loc=9, ncol=7, bbox_to_anchor=(0.26, 0., 0.5, 1.5))
ax=g.ax_heatmap
  File "<ipython-input-64-4bc6be89afe3>", line 11, in <module>
    tree_kws=dict(colors=tree_colors))
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\seaborn\matrix.py", line 1391, in clustermap
    tree_kws=tree_kws, **kwargs)
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\seaborn\matrix.py", line 1208, in plot
    tree_kws=tree_kws)
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\seaborn\matrix.py", line 1054, in plot_dendrograms
    tree_kws=tree_kws
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\seaborn\matrix.py", line 776, in dendrogram
    return plotter.plot(ax=ax, tree_kws=tree_kws)
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\seaborn\matrix.py", line 692, in plot
    **tree_kws)
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\matplotlib\collections.py", line 1316, in __init__
    colors = mcolors.to_rgba_array(colors)
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\matplotlib\colors.py", line 294, in to_rgba_array
    result[i] = to_rgba(cc, alpha)
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\matplotlib\colors.py", line 177, in to_rgba
    rgba = _to_rgba_no_colorcycle(c, alpha)
  File "C:\Users\rotemb\AppData\Local\Continuum\anaconda3\lib\site-packages\matplotlib\colors.py", line 240, in _to_rgba_no_colorcycle
    raise ValueError("Invalid RGBA argument: {!r}".format(orig_c))
ValueError: Invalid RGBA argument: 0
Any help on this would be greatly appreciated! Thanks!
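According to the sns.clustermap documentation, the dendrogram coloring can be set through tree_kws (which takes a dict) and its colors entry, which expects a list of RGB tuples such as (0.5, 0.5, 1). The traceback is consistent with this: tree_colors in the question is a dict keyed by integers, so matplotlib ends up trying to interpret the key 0 as a color, and the values are strings like "(0,0.9,0.4)" rather than tuples. It also seems that colors must hold actual color values such as RGB(A) tuples; string representations of tuples are not accepted.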
Did you notice that clustermap supports nested lists or data frames for hierarchical color bars between the dendrograms and the correlation matrix? They could be useful if the dendrograms get too crowded.
I hope this helps!
The list of RGB tuples is the sequence of line colors in the LineCollection: it uses the sequence as it draws each line in both dendrograms. (The drawing order seems to start from the rightmost branch of the column dendrogram.) In order to associate a certain label with a data point, you need to figure out the drawing order of data points in the dendrograms.
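One possible way to work out that order is to compute the same linkage with SciPy and let scipy.cluster.hierarchy.dendrogram report a color per link through its link_color_func argument. The sketch below rests on assumptions: the function name link_colors_in_drawing_order and the rule "a link gets its group's color only if everything beneath it shares that color, otherwise black" are made up for illustration, and it assumes seaborn draws the links in the order in which SciPy lists them for the same method and metric, which I have not verified across versions.

```python
import numpy as np
from scipy.cluster import hierarchy


def link_colors_in_drawing_order(data, leaf_colors, method="average",
                                 metric="euclidean", mixed="#000000"):
    """Return (colors, leaves): one color string per dendrogram link, in the
    order SciPy lists the links, plus the leaf order of the tree.

    `leaf_colors` holds one matplotlib color string per row of `data`.
    """
    Z = hierarchy.linkage(data, method=method, metric=metric)
    n = len(leaf_colors)
    # Node ids 0..n-1 are the original observations (the leaves).
    node_color = {i: c for i, c in enumerate(leaf_colors)}
    # Row i of Z merges two existing nodes into the new node with id n + i.
    for i, (a, b) in enumerate(Z[:, :2].astype(int)):
        ca, cb = node_color[int(a)], node_color[int(b)]
        node_color[n + i] = ca if ca == cb else mixed
    # link_color_func must return a color string; SciPy stores the results
    # in "color_list", aligned with "icoord" and "dcoord".
    dendro = hierarchy.dendrogram(
        Z, no_plot=True, link_color_func=lambda k: node_color[k])
    return dendro["color_list"], dendro["leaves"]


if __name__ == "__main__":
    points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
    colors, leaves = link_colors_in_drawing_order(
        points, ["#ff0000", "#ff0000", "#0000ff", "#0000ff"])
    print(colors, leaves)
```

Under those assumptions, the returned list could be handed to clustermap as tree_kws={'colors': colors}, using the same method and metric in both calls. Keep in mind that the one tree_kws dict is applied to both dendrograms, so this is only meaningful when the row and column trees have the same structure or when one of the two dendrograms is switched off.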
Here's a minimal example for coloring the tree, based on the sns.clustermap examples:
import matplotlib.pyplot as plt
import seaborn as sns; sns.set(color_codes=True)
import pandas as pd
iris = sns.load_dataset("iris")
species = iris.pop("species")
g = sns.clustermap(iris)
lut = dict(zip(species.unique(), "rbg"))
row_colors = species.map(lut)
# For demonstrating the hierarchical sidebar coloring
df_colors = pd.DataFrame(data={'r': row_colors[row_colors == 'r'], 'g': row_colors[row_colors == 'g'], 'b': row_colors[row_colors == 'b']})
# Simple class RGBA colormap
colmap = {'setosa': (1, 0, 0, 0.7), 'virginica': (0, 1, 0, 0.7), 'versicolor': (0, 0, 1, 0.7)}
g = sns.clustermap(iris, row_colors=df_colors, tree_kws={'colors':[colmap[s] for s in species]})
plt.savefig('clustermap.png')
As you can see, the drawing order of the tree's lines starts from the upper right corner of the image and is therefore not tied to the order of the data points visualized in the clustermap. On the other hand, the color bars (controlled by the {row,col}_colors attributes) could be used for that purpose.