I am making a dendrogram and a scatterplot from the same data, both plotted with ggplot2. I use as.dendrogram() to convert the hclust() object into a form suitable for ggplot(), then set branch and label colors using the set() function from dendextend. I use metaMDS() and scores() from vegan for the NMDS results. I color the points in the NMDS scatterplot based on groups created by dendextend::cutree(). I use the same color palette to color the branches and labels in the dendrogram.
However, the colors of the groups do not match between the two plots. I assumed that because I applied the same palette to both plots and used the groups from cutree() to color the scatterplot, each group would get the same color in both.
How can I ensure that both plots use the same colors for the same groups?
Notice in the images below that Milan, Barcelona, and Marseille (for one example) are one color in the dendrogram and another color in the scatterplot. In the dendrogram, Barcelona gets the fourth color from the palette, but cutree() assigns it to group 2, so it gets the second color from the palette in the scatterplot.
Dendrogram
NMDS scatterplot
MWE
library(RColorBrewer)
library(dendextend)
library(dplyr)
library(ggplot2)
library(vegan)
kay <- 5
mycolors <- brewer.pal(kay, "Dark2")
euro.hc <- hclust(eurodist)
euro.cut <- dendextend::cutree(euro.hc, k = kay)
euro.nmds <- metaMDS(eurodist, k = kay)
euro.df <- scores(euro.nmds, display = "sites", tidy = TRUE) %>%
  mutate(grp = as.factor(euro.cut))

dend <- as.dendrogram(euro.hc) %>%
  set("branches_k_color",
    value = mycolors,
    k = kay
  ) %>%
  set("labels_colors",
    value = mycolors,
    k = kay
  ) %>%
  set("branches_lwd", 1.0) %>%
  set("labels_cex", 1)

dend %>% ggplot(horiz = TRUE) +
  scale_x_continuous(expand = c(-1, -1)) +
  scale_y_reverse(expand = c(1, 1)) +
  theme(
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank()
  )

euro.df %>%
  ggplot() +
  geom_point(aes(x = NMDS1, y = NMDS2, color = grp)) +
  geom_text(aes(x = NMDS1, y = NMDS2, label = label, color = grp),
    vjust = -1,
    hjust = .50
  ) +
  scale_colour_manual(values = mycolors, guide = NULL) +
  coord_equal() +
  theme_minimal() +
  theme(
    line = element_blank(),
    axis.text = element_blank()
  )
Created on 2023-09-25 with reprex v2.0.2
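The mismatch described above can be checked without plotting. The following is a minimal sketch in base R only (it does not use dendextend or vegan); the variable names are illustrative and not part of the MWE. It compares the order in which cluster ids appear in the data with the order in which they appear along the dendrogram leaves. If the two orders differ, a palette applied by cluster id and a palette applied from one end of the tree to the other will not agree.

```r
# Base R only; eurodist ships with the datasets package.
k <- 5
hc <- hclust(eurodist)

# Cluster ids from stats::cutree(), returned in data (input) order.
grp_data <- stats::cutree(hc, k = k)

# City labels in the order they are drawn along the dendrogram.
leaf_labels <- hc$labels[hc$order]

# Cluster ids in the order they are first met walking through the data.
ids_along_data <- unique(grp_data)

# Cluster ids in the order they are first met walking along the leaves.
ids_along_leaves <- unique(grp_data[leaf_labels])

print(ids_along_data)
print(ids_along_leaves)
```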
After much trial and error, I finally figured out a solution. It works well but seems "hacky" so I would like to learn if there is a better/more efficient method.
Here's a summary of the changes. The full working code and output are included below.
- Set order_clusters_as_data = FALSE in cutree()
- Get the labels from cutree() and the branch and label colors via get_leaves_branches_col
- Move the color outside of aes() and take it from the colr column of the dataframe.

library(RColorBrewer)
library(dendextend)
library(dplyr)
library(ggplot2)
library(vegan)
kay <- 5
mycolors <- brewer.pal(kay, "Dark2")
euro.hc <- hclust(eurodist)
euro.cut <- dendextend::cutree(euro.hc, k = kay, order_clusters_as_data = FALSE)
euro.nmds <- metaMDS(eurodist, k = kay)
dend <- as.dendrogram(euro.hc) %>%
  set("branches_k_color",
    value = mycolors,
    k = kay
  ) %>%
  set("labels_colors",
    value = mycolors,
    k = kay
  ) %>%
  set("branches_lwd", 1.0) %>%
  set("labels_cex", 1)

label <- names(euro.cut)
colr <- get_leaves_branches_col(dend)
tmp.df <- data.frame(label = label, colr = colr) #order = order

euro.df <- scores(euro.nmds, display = "sites", tidy = TRUE) %>%
  left_join(x = ., y = tmp.df, by = "label")

dend %>% ggplot(horiz = TRUE) +
  scale_x_continuous(expand = c(-1, -1)) +
  scale_y_reverse(expand = c(1, 1)) +
  theme(
    axis.title = element_blank(),
    axis.text = element_blank(),
    axis.ticks = element_blank()
  )

euro.df %>%
  ggplot() +
  geom_point(aes(x = NMDS1, y = NMDS2),
    color = euro.df$colr) +
  geom_text(aes(x = NMDS1, y = NMDS2, label = label),
    color = euro.df$colr,
    vjust = -1,
    hjust = .50
  ) +
  scale_colour_manual(values = mycolors, guide = NULL) +
  coord_equal() +
  theme_minimal() +
  theme(
    line = element_blank(),
    axis.text = element_blank()
  )
Created on 2023-09-26 with reprex v2.0.2
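One possible alternative, sketched here in base R only, is to keep the default cutree() groups and instead build a named palette that maps each group id to the color it would receive in the dendrogram. This assumes that dendextend hands out the palette colors to clusters in leaf order, which is what the fix above relies on; I have not verified it beyond this example. The names group_colors and point_colors are hypothetical, and the hex codes are the first five Dark2 colors written out so the sketch does not need RColorBrewer.

```r
# Base R only; eurodist ships with the datasets package.
k <- 5
# First five colors of the Brewer "Dark2" palette, written out by hand.
mycolors <- c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E")

hc <- hclust(eurodist)
grp <- stats::cutree(hc, k = k)            # group ids in data order

# Group ids in the order they are first met along the dendrogram leaves.
first_seen <- unique(grp[hc$labels[hc$order]])

# Assumption: the i-th cluster along the leaves gets mycolors[i].
group_colors <- setNames(mycolors[seq_along(first_seen)], first_seen)

# Reorder so the names run "1", "2", ..., matching the factor levels of grp.
group_colors <- group_colors[as.character(seq_len(k))]

# One row per city, with its group and the color that group maps to.
point_colors <- data.frame(
  label = names(grp),
  grp   = factor(unname(grp)),
  colr  = unname(group_colors[as.character(grp)])
)

print(group_colors)
```

With this in place, the scatterplot could keep color = grp inside aes() and pass the named vector to scale_colour_manual(values = group_colors), so the color mapping stays within ggplot2's scale rather than being set per point.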