I am trying to keep my plots with consistent variable colours while using the factoextra
library to plot PCA results. Reproducible example below:
library("FactoMineR")
library("factoextra")  # provides decathlon2, get_eig() and fviz_contrib()
data("decathlon2")
df <- decathlon2[1:23, 1:10]
res.pca <- PCA(df, graph = FALSE)
get_eig(res.pca)
# Contributions of variables to PC1
fviz_contrib(res.pca, choice = "var", axes = 1, top = 10)
# Contributions of variables to PC2
fviz_contrib(res.pca, choice = "var", axes = 2, top = 10)
I would like the plots for PC1 and PC2 to share an identical 10-colour palette (i.e. x100m will be red in both). However, my actual dataset has 15 explanatory variables, which seems to exceed the limit for ColorBrewer palettes, so there are 2 problems:
Thank you in advance.
(I assume you already know you need to add fill = "name" to the fviz_contrib() call; otherwise the bars will default to fill = "steelblue".)
You can define the palette manually, such that each variable corresponds to the same colour.
To simulate the problem using the example in the question, suppose we only want to show the top 7, when there are 10 variables all together:
# naive way with 7-color palette applied to different variables
fviz_contrib(res.pca, choice = "var", fill = "name", color = "black", axes = 1, top = 7)
fviz_contrib(res.pca, choice = "var", fill = "name", color = "black", axes = 2, top = 7)
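The mismatch arises because a default discrete fill scale hands out colours by position among the levels present in each plot, not by variable name. A small base-R sketch of that effect, using made-up top-3 subsets and rainbow() as a stand-in for the default palette (both are assumptions for illustration only):

```r
# Hypothetical top-3 variables for two components (made-up subsets)
top_pc1 <- c("X100m", "Long.jump", "Shot.put")
top_pc2 <- c("Discus", "X100m", "High.jump")

# Stand-in for a default palette: n colours handed out by position
cols_pc1 <- setNames(grDevices::rainbow(length(top_pc1)), top_pc1)
cols_pc2 <- setNames(grDevices::rainbow(length(top_pc2)), top_pc2)

# The same variable ends up with a different colour in each plot,
# because it sits at a different position in each subset
cols_pc1[["X100m"]]
cols_pc2[["X100m"]]
```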
We can create a palette using hue_pal() from the scales package, for 10 different colours (one for each column of df).
(You can also use palettes such as rainbow() / heat.colors() / etc. from the base grDevices package. I find their default colour range to be rather intense, though, with a tendency to be overly glaring for a bar chart.)
mypalette <- scales::hue_pal()(ncol(df))
names(mypalette) <- colnames(df)
# optional: see what each color corresponds to
ggplot(data.frame(x = names(mypalette),
                  y = 1,
                  fill = mypalette)) +
  geom_tile(aes(x = x, y = y, fill = fill), color = "black") +
  scale_fill_identity() +
  coord_equal()
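The same idea should carry over to the 15-variable case from the question, since a generated palette is not tied to a fixed number of colours. A minimal base-R sketch, assuming placeholder variable names var1 to var15 and the "Dark 3" qualitative palette from grDevices::hcl.colors() in place of scales::hue_pal() (my substitutions, not something from the question):

```r
# Hypothetical variable names standing in for the 15 real columns
vars <- paste0("var", 1:15)

# One colour per variable; hcl.colors() (R >= 3.6.0) accepts any n
pal15 <- grDevices::hcl.colors(length(vars), palette = "Dark 3")
names(pal15) <- vars

# Made-up subsets, as if different variables made the cut on each axis
top_pc1 <- c("var3", "var7", "var1")
top_pc2 <- c("var7", "var12", "var3")

# Looking colours up by name gives the same colour for a given
# variable, whichever subset it appears in
pal15[top_pc1]
pal15[top_pc2]
```

Because the lookup is by name rather than by position, var7 gets the same colour in both subsets.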
Use scale_fill_manual() with the self-defined palette on each chart:
fviz_contrib(res.pca, choice = "var", fill = "name", color = "black", axes = 1, top = 7) +
  scale_fill_manual(values = mypalette)
fviz_contrib(res.pca, choice = "var", fill = "name", color = "black", axes = 2, top = 7) +
  scale_fill_manual(values = mypalette)
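Since the same scale is added to every chart, the call can be wrapped in a small helper to avoid repetition. A sketch, where plot_contrib() is a hypothetical name of my own (not part of factoextra) and the setup from the question is repeated so the block runs on its own:

```r
library(FactoMineR)
library(factoextra)  # also attaches ggplot2

data("decathlon2")
df <- decathlon2[1:23, 1:10]
res.pca <- PCA(df, graph = FALSE)

# One colour per variable, named so that matching is by variable name
mypalette <- scales::hue_pal()(ncol(df))
names(mypalette) <- colnames(df)

# plot_contrib() is a hypothetical helper, not a factoextra function
plot_contrib <- function(pca, axis, palette, top = 7) {
  fviz_contrib(pca, choice = "var", fill = "name", color = "black",
               axes = axis, top = top) +
    ggplot2::scale_fill_manual(values = palette)
}

p1 <- plot_contrib(res.pca, axis = 1, palette = mypalette)
p2 <- plot_contrib(res.pca, axis = 2, palette = mypalette)
```

This works because scale_fill_manual() matches a named values vector to the fill levels by name, so each variable keeps its colour even when the top 7 differ between axes.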