I have a data frame with too many values on the x-axis, so I would like to introduce breaks in my x-axis labels and change those labels based on another column. I've found a solution here is-it-possible-to-have-a-continuous-line-with-geom-line-across-facets-with-facete which works when the breaks are spaced by 1, but when I try a larger spacing I get an error.
Below is code modified from the linked example.
library(patchwork)
library(ggplot2)
library(scales)
df_graph_data = data.frame(
  year = c(
    rep.int("2020", times = 11),
    rep.int("2021", times = 12),
    rep.int("2022", times = 3)
  ),
  month_name = c(
    "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
    "January", "February", "March"
  ),
  month_number = c(
    "02", "03", "04", "05", "06", "07",
    "08", "09", "10", "11", "12", "01",
    "02", "03", "04", "05", "06", "07",
    "08", "09", "10", "11", "12", "01",
    "02", "03"
  ),
  number_of_queries = c(
    484819, 576697, 843015, 925175,
    1102853, 889212, 835706, 774622,
    701338, 850297, 1046064, 1273363,
    958868, 1088284, 1151606, 1666950,
    2025731, 2731704, 2429019, 3228395,
    3204915, 2612807, 2811946, 3053788,
    2589273, 2305433
  )
)
df_graph_data$rownum = 1:nrow(df_graph_data)
windows()
graph <- ggplot(df_graph_data) +
  geom_line(
    aes(x = rownum, y = number_of_queries),
    size = 1, colour = "blue", linetype = "solid"
  ) +
  scale_x_continuous(
    breaks = seq(
      min(df_graph_data$rownum),
      max(df_graph_data$rownum),
      by = 1
    ),
    labels = df_graph_data$month_number
  )
graph
This produces this graph
The data set I have is much larger, so I would need the breaks spaced 10 apart (by = 10), but when I try this I get the following error: breaks and labels must have the same length.
I would like to find out if there is a way to introduce breaks based on one column and then change the labels based on a corresponding column. For example, if the breaks fall at rownum 10, 20 and 30, then each label should be the month_name that corresponds to that rownum.
The idea of breaks and labels is rather straightforward: place labels[i] at position breaks[i].
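A minimal sketch of this pairing rule in base R, assuming toy vectors that mirror the layout of the question's data (26 rows, month numbers starting at "02"). The names lab and brk are made up for illustration:

```r
# Toy stand-ins for df_graph_data$rownum and df_graph_data$month_number
rownum <- 1:26
month_number <- sprintf("%02d", (rownum %% 12) + 1)

# Break positions spaced 10 apart: 1, 11, 21
brk <- seq(min(rownum), max(rownum), by = 10)

# Pairing rule: labels[i] is drawn at breaks[i],
# so both vectors need the same length
lab <- month_number[brk]
stopifnot(length(lab) == length(brk))

# Show which label ends up at which position
print(data.frame(position = brk, label = lab))

# The full column has one entry per row, so it cannot be paired
# one-to-one with only three breaks
length(month_number) == length(brk)
```

Subsetting the label column with the same index that defines the breaks keeps the two vectors aligned by construction.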
If you want to space your labels further apart, you can use, for instance, this snippet:
brk_idx <- seq(
min(df_graph_data$rownum),
max(df_graph_data$rownum),
by = 10
)
ggplot(df_graph_data) +
geom_line(aes(x = rownum,
y = number_of_queries),
linewidth = 1, colour = "blue",
linetype = "solid") +
scale_x_continuous(
breaks = df_graph_data$rownum[brk_idx],
labels = df_graph_data$month_number[brk_idx])
What it basically does is look up the rows given by brk_idx and take rownum as the position and month_number as the label at this position:
df_graph_data[brk_idx, c("rownum", "month_number")]
# rownum month_number
# 1 1 02
# 11 11 12
# 21 21 10
That is, place "02" at position 1, "12" at position 11 and "10" at position 21. (N.B. brk_idx and df_graph_data$rownum[brk_idx] are the very same here.)
This explains your error, by the way, when you changed the by argument in seq to 10. You wanted to place all month_number values at positions 1, 11 and 21, so you had 26 labels but only 3 positions.
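Since the question asks for month_name rather than month_number as the label, the same lookup can also be written as a function passed to labels, which ggplot2 calls with the break positions. The sketch below is one possible way to do that, not the only one. The toy data frame and the helper label_by_rownum are hypothetical names introduced for illustration, and the query counts are made up:

```r
library(ggplot2)

# Toy stand-in for df_graph_data (number_of_queries values are made up)
toy <- data.frame(rownum = 1:26)
toy$month_name <- month.name[(toy$rownum %% 12) + 1]
toy$number_of_queries <- seq(500000, by = 100000, length.out = nrow(toy))

# Hypothetical helper: for each break position, return the month_name
# of the row whose rownum equals that position (NA if there is none)
label_by_rownum <- function(brk) {
  toy$month_name[match(brk, toy$rownum)]
}

ggplot(toy) +
  geom_line(aes(x = rownum, y = number_of_queries),
            linewidth = 1, colour = "blue") +
  scale_x_continuous(
    breaks = seq(min(toy$rownum), max(toy$rownum), by = 10),
    labels = label_by_rownum
  )
```

With this approach the labels are derived from whatever breaks are in use, so changing by does not require recomputing a separate label vector. A break that does not coincide with any rownum would get an NA label.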