r ggplot2 visualization geom

Calendar plot with geom_segment()


I have the following dataset in R. I want to build a ggplot whose x-axis runs from 1 to 12 (January, February, ..., December) and whose y-axis runs from 1 to 6 (the num_months variable; the example contains only 1 and 6). I then want to use geom_segment() to draw each row from start_month to end_month, so that the segment length reflects num_months, and to facet horizontally by the variable year.

My main problems, so far, are:

  1. I want each segment to occupy the full month(s): if the starting and ending month are both 5 (May), the segment should start at the beginning of May and end at the beginning of June (6).
  2. Several segments have the same duration (num_months); I want them drawn in parallel so they don't overlap.
  3. I want each num_months value to look more like a panel, because the current plot is confusing to read. I drew boxes, but some segments fall outside the box for their num_months and others extend past its edges, so the boxes don't actually enclose the lines.

library(tidyverse)  # read_csv(), dplyr verbs, ggplot2

data <- read_csv("num_months,start_month_year,end_month_year,B1,B1_p,year,start_month,end_month
1,6,6,3.3571016788482666,0.007681768853217363,2021,5,5
1,8,8,2.548985481262207,0.007373321335762739,2021,7,7
1,10,10,2.139772415161133,0.03452971577644348,2021,9,9
1,12,12,2.165775775909424,0.07796278595924377,2021,11,11
1,13,13,1.9506219625473022,0.09215697646141052,2021,12,12
1,23,23,2.7839596271514893,0.011407249607145786,2022,10,10
1,25,25,2.220555543899536,0.06181173026561737,2022,12,12
6,6,11,0.9881601333618164,0.08719704300165176,2021,5,10
6,8,13,1.438501238822937,0.032221969217061996,2021,7,12
6,9,14,1.16400945186615,0.09187468141317368,2021,8,1
6,10,15,1.5834165811538696,0.03494146466255188,2021,9,2
6,11,16,1.294316291809082,0.09792502969503403,2021,10,3
6,12,17,1.4204859733581543,0.0546354204416275,2021,11,4
6,20,25,1.07038414478302,0.0722803920507431,2022,7,12") %>%
  mutate(
    end_month = ifelse(start_month == end_month, end_month + 1, end_month),
    end_month = ifelse(end_month > 12, 1, end_month)  # Wrap around to January if end_month exceeds 12
  ) %>%
  group_by(year, num_months) %>%
  mutate(
    y_pos = num_months + (row_number() - 1) * 0.2  # Adding a systematic offset to y position
  ) %>%
  ungroup()

# Create the boxes for num_months
boxes <- data %>%
  group_by(year, num_months) %>%
  summarise(
    ymin = min(y_pos) - 0.3,
    ymax = max(y_pos) + 0.3
  ) %>%
  ungroup()

# Create the ggplot
p <- ggplot(data) +
  geom_rect(data = boxes, aes(xmin = 0.5, xmax = 12.5, ymin = ymin, ymax = ymax), fill = NA, color = "grey") +
  geom_segment(aes(x = start_month, xend = end_month, y = y_pos, yend = y_pos, color = as.factor(num_months)), linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  scale_x_continuous(breaks = 1:12, limits = c(0.5, 12.5), labels = month.abb) +
  scale_y_continuous(breaks = 1:6, limits = c(0.5, 6.5), expand = expansion(mult = c(0.02, 0.1))) +  # Adjusting y-axis limits to accommodate offset
  facet_wrap(~ year) +
  labs(x = "Month", y = "Number of Months", color = "Number of Months") +
  theme_minimal() +
  theme(panel.spacing = unit(1, "lines"))  # Increase spacing between panels

print(p)

Here's how it currently looks: segments with the same num_months overlap, and some lines extend into the box that belongs to a different num_months.

[Image: plot produced by the code above]


Solution

  • Here's my suggestion. The big changes are:

    • I use actual dates, not numbers, for the x-axis: each segment starts on the first day of start_month and ends on the last day of end_month (in an arbitrary dummy year, 2023). This makes the segments "occupy the full months".

    • Since you want the num_months to look "more like panels", I include them in the faceting. (Note that you can facet by more than one variable in the rows, so if you also want to facet country by rows you can do that too; see the "Margins" example at the bottom of the ?facet_grid help page.)

    • Because num_months is now a facet, the row number within each year/num_months group can serve as the y aesthetic, which spaces the lines evenly regardless of how many there are.
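
    To make the first bullet concrete, here is a minimal sketch of the month-to-date conversion on its own. The helper names month_start() and month_end() are hypothetical (they are not part of the code below), and the dummy year 2023 is only there so the month numbers can be turned into dates.

    ```r
    library(lubridate)

    # Hypothetical helpers, for illustration only
    # First day of month m in the dummy year
    month_start <- function(m, yr = 2023) make_date(yr, m, 1)
    # First day of the following month minus one day = last day of month m
    month_end <- function(m, yr = 2023) make_date(yr, m, 1) + months(1) - days(1)

    month_start(5)  # first day of May
    month_end(5)    # last day of May
    ```

    With start and end computed this way, a one-month segment spans the whole month on a date axis instead of collapsing to a single point.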

    Since theme_minimal() doesn't draw borders around its facet panels, I switched to theme_bw(), but you can of course customize the theming however you want.

    library(tidyverse)
    library(lubridate)  # ymd(), ceiling_date()

    data <- read_csv("num_months,start_month_year,end_month_year,B1,B1_p,year,start_month,end_month
    1,6,6,3.3571016788482666,0.007681768853217363,2021,5,5
    1,8,8,2.548985481262207,0.007373321335762739,2021,7,7
    1,10,10,2.139772415161133,0.03452971577644348,2021,9,9
    1,12,12,2.165775775909424,0.07796278595924377,2021,11,11
    1,13,13,1.9506219625473022,0.09215697646141052,2021,12,12
    1,23,23,2.7839596271514893,0.011407249607145786,2022,10,10
    1,25,25,2.220555543899536,0.06181173026561737,2022,12,12
    6,6,11,0.9881601333618164,0.08719704300165176,2021,5,10
    6,8,13,1.438501238822937,0.032221969217061996,2021,7,12
    6,9,14,1.16400945186615,0.09187468141317368,2021,8,1
    6,10,15,1.5834165811538696,0.03494146466255188,2021,9,2
    6,11,16,1.294316291809082,0.09792502969503403,2021,10,3
    6,12,17,1.4204859733581543,0.0546354204416275,2021,11,4
    6,20,25,1.07038414478302,0.0722803920507431,2022,7,12") %>%
      mutate(
        # First day of the start month and last day of the end month, in a dummy year
        start_dt = ymd(paste("2023", start_month, "01", sep = "-")),
        end_dt = ceiling_date(ymd(paste("2023", end_month, "01", sep = "-")), unit = "month") - 1
      ) %>%
      mutate(
        yy = row_number(),  # Position within each year/num_months panel, used as y
        .by = c(year, num_months)  # Per-operation grouping (dplyr >= 1.1.0)
      )
    
    
    ggplot(data) +
      geom_segment(aes(x = start_dt, xend = end_dt, y = yy, yend = yy, color = factor(num_months)), linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
      scale_x_date(
        date_labels = "%b", 
        date_breaks = "1 month",
        limits = ymd(c("2023-01-01", "2023-12-31")),
        expand = expansion(0, 0)
      ) +
      scale_y_continuous(labels = NULL) +
      facet_grid(rows = vars(num_months), cols = vars(year), space = "free_y", scales = "free_y") +
      labs(x = "Month", y = "Number of Months", color = "Number of Months") +
      theme_bw() +
      theme(
        panel.spacing = unit(1, "lines"),  # Increase spacing between panels
        panel.grid.major.y = element_blank(),
        axis.ticks.y = element_blank()
      )
    

    [Image: plot produced by the code above]
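
    One caveat: in the sample data, the 6-month rows that start in August through November end in January through April, so with a single dummy year their end_dt falls before their start_dt and the segment is drawn backwards. A possible way around this, sketched below on toy rows, is to split each wrapping segment into two pieces (start month to December, and January to end month) that share the same y position. Keeping both pieces in the start year's panel is an assumption on my part, and the names segs, wrapped and split_segs are hypothetical.

    ```r
    library(dplyr)
    library(lubridate)

    # Toy rows (hypothetical) with the same columns the plot uses
    segs <- data.frame(
      year = c(2021, 2021),
      num_months = c(6, 6),
      start_month = c(5, 8),
      end_month = c(10, 1)
    ) %>%
      mutate(yy = row_number(), .by = c(year, num_months))

    # Rows whose end month is earlier than the start month wrap past December
    wrapped <- segs %>% filter(end_month < start_month)

    split_segs <- bind_rows(
      segs %>% filter(end_month >= start_month),  # unchanged rows
      wrapped %>% mutate(end_month = 12),         # piece 1: start month to December
      wrapped %>% mutate(start_month = 1)         # piece 2: January to end month
    ) %>%
      mutate(
        start_dt = make_date(2023, start_month, 1),
        end_dt = make_date(2023, end_month, 1) + months(1) - days(1)
      )
    ```

    Passing split_segs to the same ggplot() call should draw both pieces on the same row, since they keep the yy of the row they came from.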