Search code examples
rtidyr

reordering my reshape: long to wide with pivot_wider, different column order


I need to reshape a long data set (df below) to wide, where multiple variables are the same across long entries for a given ID, and others change by row. The dummy data is as follows:

ID = c("A", "A", "B", "B", "B", "C", "C")
Name = c("mary", "mary", "berry", "berry", "berry", "paul", "paul")
Set = c("set1", "set2", "set1", "set2", "set3", "set1", "set2")
Street = c("123 St", "234 St", "543 St", "492 st", "231 st", "492 st", "231 st")
State = c("al", "nc", "fl", "ca", "md", "tx", "vt")

df = data.frame(ID, Name, Set, Street, State)

I used pivot_wider to reshape it, but I feel like it's out of order from what I want. Since the actual data has 20 Sets for each entry and 7 Variables for each set, is there an easy way to do this while it's reshaping?

It looks like this:

test <- pivot_wider(df, names_from = c("Set"), values_from = c("Street", "State"))
test
# A tibble: 3 x 8
  ID    Name  Street_set1 Street_set2 Street_set3 State_set1 State_set2 State_set3
  <chr> <chr> <chr>       <chr>       <chr>       <chr>      <chr>      <chr>     
1 A     mary  123 St      234 St      NA          al         nc         NA        
2 B     berry 543 St      492 st      231 st      fl         ca         md        
3 C     paul  492 st      231 st      NA          tx         vt         NA        

But what I want is for it to look like this:

  ID  Name Set1_Street Set1_State Set2_Street Set2_State Set3_Street Set3State
1  A  mary     123 St        al     234 St        nc       <NA>      <NA>
2  B berry     543 St        fl     492 st        fl    231 st         md
3  C  paul     492 st        tx     231 st        vt       <NA>      <NA>

I'd also really love your opinions on which option (reshape, spread) is better for large datasets if you have thoughts on that!

Edit: left out the pivot_wider command I used, fixed! Yikes


Solution

  • It may be easier with names_glue in pivot_wider

    library(dplyr)
    library(tidyr)
    df %>% 
       pivot_wider(names_from = Set, values_from = c(Street, State), 
           names_glue = "{tools::toTitleCase(Set)}_{.value}") %>%   
       dplyr::select(ID, Name, order(readr::parse_number(names(.)[-(1:2)])) + 2)
    

    -output

    # A tibble: 3 × 8
      ID    Name  Set1_Street Set1_State Set2_Street Set2_State Set3_Street Set3_State
      <chr> <chr> <chr>       <chr>      <chr>       <chr>      <chr>       <chr>     
    1 A     mary  123 St      al         234 St      nc         <NA>        <NA>      
    2 B     berry 543 St      fl         492 st      ca         231 st      md        
    3 C     paul  492 st      tx         231 st      vt         <NA>        <NA>