I have found it challenging to create a treemap using ggplot, and this blog example captured the issues very well and also provided a nice workaround. The workaround takes the output of the treemap package and uses it to build a ggplot version with geom_rect.
My question is how to adjust the labels (and, optionally, the colors) by hierarchy level, since I have more groups than the linked example and different labeling requirements.
Here is a simple reproducible example:
library(tidyverse)
library(treemap)
# Create dummy data
tree_data <- data.frame(
  my_segment = c(
    rep("seg_a", 5),
    rep("seg_b", 6),
    rep("seg_c", 7)),
  my_class = c(
    rep("class_1", 2),
    rep("class_2", 2),
    rep("class_3", 1),
    rep("class_4", 2),
    rep("class_5", 2),
    rep("class_6", 2),
    rep("class_7", 1),
    rep("class_8", 3),
    rep("class_9", 3)),
  my_type = c(
    rep("type_1", 7),
    rep("type_2", 6),
    rep("type_3", 5)),
  vals = round(runif(18, min = 20, max = 100), 0)
)
Here is the head of the sample dataframe:
   my_segment my_class my_type vals
1       seg_a  class_1  type_1   86
2       seg_a  class_1  type_1   41
3       seg_a  class_2  type_1   23
4       seg_a  class_2  type_1   79
5       seg_a  class_3  type_1   33
6       seg_b  class_4  type_1   82
7       seg_b  class_4  type_1   85
8       seg_b  class_5  type_2   40
9       seg_b  class_5  type_2   83
10      seg_b  class_6  type_2   69
11      seg_b  class_6  type_2   98
12      seg_c  class_7  type_2   91
13      seg_c  class_8  type_2   33
The treemap package runs fine, but it produces unreadable output in RStudio, and I'd like to be able to customize the plot further with ggplot (similar to the linked article).
# Run treemap function
tree_p <- treemap(
  tree_data,
  index = c("my_segment", "my_class", "my_type"),
  vColor = "my_segment",
  vSize = "vals",
  type = "index",
  fontsize.labels = c(15, 12, 10),
  fontcolor.labels = c("white", "orange", "green"),
  fontface.labels = c(2, 1, 1),
  bg.labels = 0,
  align.labels = list(
    c("center", "center"),
    c("right", "bottom"),
    c("left", "bottom")
  ),
  overlap.labels = 0.5,
  inflate.labels = FALSE
)
# Note: unreadable output in RStudio (too small)
The problem comes in when I use the workaround from this blog with an additional hierarchy level and try to change the labeling.
# Create the plot in ggplot using geom_rect
# Get underlying data created from running treemap
tm_plot_data <- tree_p$tm %>%
  mutate(x1 = x0 + w,
         y1 = y0 + h) %>%
  mutate(x = (x0 + x1) / 2,
         y = (y0 + y1) / 2) %>%
  mutate(
    primary_group = case_when(
      level == 1 ~ 1.5,
      level == 2 ~ 0.75,
      TRUE ~ 0.5
    )
  )
# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) +
  # add fill and borders for groups and subgroups
  geom_rect(aes(fill = color, size = primary_group),
    show.legend = FALSE,
    color = "black",
    alpha = 0.3
  ) +
  scale_fill_identity() +
  # set thicker lines for group borders
  scale_size(range = range(tm_plot_data$primary_group)) +
  # add labels
  ggfittext::geom_fit_text(aes(label = my_segment), color = "white", min.size = 1) +
  ggfittext::geom_fit_text(aes(label = my_class), color = "blue", min.size = 1) +
  ggfittext::geom_fit_text(aes(label = my_type), color = "red", min.size = 1) +
  # options
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  theme_void()
So my question is: is there a way to create labeling like treemap's? Specifically, seg_a, seg_b, and seg_c should each appear only once, centered over the area of their respective segments. I'd also like to move the labels so that they do not overlap.
Thanks for any help and suggestions!
The issue is that you use your full dataset tm_plot_data to add the labels. Hence, each upper-level label is drawn multiple times, once for every rectangle that belongs to it. To solve this, aggregate the data per hierarchy level and pass each aggregated dataset as data to ggfittext::geom_fit_text. To deal with overlapping labels you could, for example, use the place argument of ggfittext::geom_fit_text to move the class labels to the bottom left and the type labels to the top right.
library(tidyverse)
library(treemap)
set.seed(123)
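# tm_plot_data is created as in the question (treemap output plus x1, y1, primary_group)

# One bounding box per segment
tm_seg <- tm_plot_data %>%
  group_by(my_segment) %>%
  summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
  ungroup()
# One bounding box per class within each segment
tm_class <- tm_plot_data %>%
  group_by(my_segment, my_class) %>%
  summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
  ungroup()
# One bounding box per type within each class
tm_type <- tm_plot_data %>%
  group_by(my_segment, my_class, my_type) %>%
  summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
  ungroup()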
# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) +
  # add fill and borders for groups and subgroups
  geom_rect(aes(fill = color, size = primary_group),
    show.legend = FALSE,
    color = "black",
    alpha = 0.3
  ) +
  scale_fill_identity() +
  # set thicker lines for group borders
  scale_size(range = range(tm_plot_data$primary_group)) +
  # add labels
  ggfittext::geom_fit_text(data = tm_seg, aes(label = my_segment), color = "white", min.size = 4) +
  ggfittext::geom_fit_text(data = tm_class, aes(label = my_class), color = "blue", min.size = 1, place = "bottomleft") +
  ggfittext::geom_fit_text(data = tm_type, aes(label = my_type), color = "red", min.size = 1, place = "topright") +
  # options
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  theme_void()
#> Warning: Removed 3 rows containing missing values (geom_fit_text).
#> Warning: Removed 12 rows containing missing values (geom_fit_text).
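As for the two warnings: my guess is that they come from the rows of tm_plot_data that belong to a higher level and therefore have NA in the lower-level columns. After grouping, these become NA groups whose labels get dropped, which would fit the counts (3 segments for the class labels, 3 segments plus 9 classes for the type labels). Assuming that is the cause, you could filter the NA rows out before aggregating. A minimal sketch on a hand-made stand-in for the treemap output (tm_toy and its coordinates are hypothetical):

```r
library(dplyr)

# Hypothetical stand-in for tree_p$tm: the level-1 row describes the whole
# segment, so it has no class and my_class is NA there (assumed to mirror
# what the treemap output looks like).
tm_toy <- data.frame(
  my_segment = c("seg_a", "seg_a", "seg_a"),
  my_class   = c(NA, "class_1", "class_2"),
  level      = c(1, 2, 2),
  x0 = c(0, 0.0, 0.5),
  y0 = c(0, 0.0, 0.0),
  x1 = c(1, 0.5, 1.0),
  y1 = c(1, 1.0, 1.0)
)

# Drop the rows without a class first, then aggregate as before
tm_class_toy <- tm_toy %>%
  filter(!is.na(my_class)) %>%
  group_by(my_segment, my_class) %>%
  summarise(x0 = min(x0), y0 = min(y0), x1 = max(x1), y1 = max(y1),
            .groups = "drop")

print(tm_class_toy)
```

With the real data the same filter(!is.na(...)) step would go in front of the group_by() for tm_class and tm_type. The plot itself should look the same either way, since the NA labels are not drawn in the first place.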