I am trying to create a new column, "seasons", based on the month column already provided in my data. I am using the built-in txhousing dataset; I have already filtered the dataframe to include only the rows for the city of Houston and have called this new dataframe houston.
I have managed to recategorize the twelve months into four seasons, but the way I did it is not efficient. Does anyone have any suggestions for how I can simplify this code? Whenever I tried to provide a range of months, e.g. houston[houston$month==(3:5),] %<>% mutate(seasons = "spring"), I would get the error "In month == 3:5 : longer object length is not a multiple of shorter object length".
Thank you for any help! -an R newbie
houston[houston$month==(1),] %<>% mutate(seasons = "winter")
houston[houston$month==(2),] %<>% mutate(seasons = "winter")
houston[houston$month==(3),] %<>% mutate(seasons = "spring")
houston[houston$month==(4),] %<>% mutate(seasons = "spring")
houston[houston$month==(5),] %<>% mutate(seasons = "spring")
houston[houston$month==(6),] %<>% mutate(seasons = "summer")
houston[houston$month==(7),] %<>% mutate(seasons = "summer")
houston[houston$month==(8),] %<>% mutate(seasons = "summer")
houston[houston$month==(9),] %<>% mutate(seasons = "summer")
houston[houston$month==(10),] %<>% mutate(seasons = "fall")
houston[houston$month==(11),] %<>% mutate(seasons = "fall")
houston[houston$month==(12),] %<>% mutate(seasons = "winter")
dplyr::case_when provides a clean way to code this.
library(dplyr)
# Reprex dataframe (always include one in your questions)
houston <- tibble(month = 1:12)
houston %>%
  mutate(seasons = case_when(month %in% c(1:2, 12) ~ "winter",
                             month %in% 3:5 ~ "spring",
                             month %in% 6:9 ~ "summer",
                             month %in% 10:11 ~ "fall"))
# A tibble: 12 x 2
   month seasons
   <int> <chr>
 1     1 winter
 2     2 winter
 3     3 spring
 4     4 spring
 5     5 spring
 6     6 summer
 7     7 summer
 8     8 summer
 9     9 summer
10    10 fall
11    11 fall
12    12 winter
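
As for the message you got with month == (3:5): `==` compares the two vectors position by position and recycles the shorter one, so it does not test whether each month falls in the range. When the number of rows is not a multiple of 3, R also emits the "longer object length is not a multiple of shorter object length" message. `%in%` tests membership instead, which is most likely what you wanted. A small sketch on a toy vector (the names `by_position` and `by_membership` are just for illustration, not from your data):

```r
month <- 1:12

# `==` recycles 3:5 along `month` and compares position by position:
# 1 == 3, 2 == 4, 3 == 5, 4 == 3, 5 == 4, 6 == 5, ...
by_position <- month == 3:5

# `%in%` asks, for each element of `month`, whether it occurs anywhere in 3:5
by_membership <- month %in% 3:5

sum(by_position)      # no month happens to line up with the recycled 3, 4, 5
sum(by_membership)    # months 3, 4 and 5 are matched
month[by_membership]
```

Here the length of `month` is a multiple of 3, so `==` gives no message at all and is silently wrong, which is arguably worse than the case you ran into.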