A common data cleaning question, but with a bit of a twist that I'm having trouble executing.
I have a dataset with multiple columns of year.month information like so:
loc type 2010.01 2010.02 2010.03
Manhattan a 2300 2300 2500
Manhattan b 2999 2975 2975
I'd like to transform the data based on the column name, by splitting by year and month at the "."
So the data would look like this:
loc type year month value
Manhattan a 2010 01 2300
Manhattan a 2010 02 2300
Manhattan a 2010 03 2500
Manhattan b 2010 01 2999
Manhattan b 2010 02 2975
Manhattan b 2010 03 2975
How can I go about this?
I'm thinking of using melt, something like this but I'm pretty new to R and know this isn't correct:
df <- melt(df,id=1,measure=patterns(".",value.name="Value"))
You can achieve this via pivot_longer
and separate
from tidyr
. Try this:
library(dplyr)
library(tidyr)
df <- read.table(text = "loc type 2010.01 2010.02 2010.03
Manhattan a 2300 2300 2500
Manhattan b 2999 2975 2975 ", header = TRUE)
df
#> loc type X2010.01 X2010.02 X2010.03
#> 1 Manhattan a 2300 2300 2500
#> 2 Manhattan b 2999 2975 2975
df %>%
pivot_longer(-c(loc:type)) %>%
separate(name, into = c("year", "month")) %>%
mutate(year = gsub("X", "", year))
#> # A tibble: 6 x 5
#> loc type year month value
#> <fct> <fct> <chr> <chr> <int>
#> 1 Manhattan a 2010 01 2300
#> 2 Manhattan a 2010 02 2300
#> 3 Manhattan a 2010 03 2500
#> 4 Manhattan b 2010 01 2999
#> 5 Manhattan b 2010 02 2975
#> 6 Manhattan b 2010 03 2975
Created on 2020-04-04 by the reprex package (v0.3.0)