Search code examples
rdataframetransformdata-cleaningreshape2

R transform data frame columns into single column based on column name split


A common data cleaning question, but with a bit of a twist that I'm having trouble executing.

I have a dataset with multiple columns of year.month information like so:

    loc      type  2010.01  2010.02  2010.03 
  Manhattan   a      2300     2300     2500     
  Manhattan   b      2999     2975     2975     

I'd like to transform the data based on the column name, by splitting by year and month at the "."

So the data would look like this:

loc        type  year   month  value
Manhattan   a    2010    01     2300
Manhattan   a    2010    02     2300
Manhattan   a    2010    03     2500
Manhattan   b    2010    01     2999
Manhattan   b    2010    02     2975
Manhattan   b    2010    03     2975

How can I go about this?

I'm thinking of using melt, something like this but I'm pretty new to R and know this isn't correct:

df <- melt(df,id=1,measure=patterns(".",value.name="Value"))

Solution

  • You can achieve this via pivot_longer and separate from tidyr. Try this:

    library(dplyr)
    library(tidyr)
    
    df <- read.table(text = "loc      type  2010.01  2010.02  2010.03 
      Manhattan   a      2300     2300     2500     
      Manhattan   b      2999     2975     2975     ", header = TRUE)
    df
    #>         loc type X2010.01 X2010.02 X2010.03
    #> 1 Manhattan    a     2300     2300     2500
    #> 2 Manhattan    b     2999     2975     2975
    
    df %>% 
      pivot_longer(-c(loc:type)) %>% 
      separate(name, into = c("year", "month")) %>% 
      mutate(year = gsub("X", "", year))
    #> # A tibble: 6 x 5
    #>   loc       type  year  month value
    #>   <fct>     <fct> <chr> <chr> <int>
    #> 1 Manhattan a     2010  01     2300
    #> 2 Manhattan a     2010  02     2300
    #> 3 Manhattan a     2010  03     2500
    #> 4 Manhattan b     2010  01     2999
    #> 5 Manhattan b     2010  02     2975
    #> 6 Manhattan b     2010  03     2975
    

    Created on 2020-04-04 by the reprex package (v0.3.0)