Search code examples
rdplyrspread

Using spread for more than one value


I am coding in R and have a slight issue with reshaping my data model. This is what my data table is similar to:

data sample

name      job                company    number
John      Product Manager       Google       1
Sam       software developer    Microsoft    1
Sam       Product Manager       Microsoft    2
Matt      hr director           Chevron      1
Cassy     Head of Investments   Apple        1
Cassy     CEO                   JP Morgan    2
Cassy     CFO                   Amazon       3

This is what I am trying to make my data look like:

name      job_1                 company_1       job_2         company_2     job_3   company_3
John      Product Manager       Google       
Sam       software developer    Microsoft    Product Manager  Microsoft
Matt      hr director           Chevron      
Cassy     Head of Investments   Apple         CEO             JP Morgan      CFO     Amazon
Cassy     CFO                   Amazon       

I tried using spread() but it would not let me put more than variable in the value = section and I cannot concatenate the values into a string and use a sep() as the actual data file will take a few days to run. Is there any way to do this?


Solution

  • spread has been replaced with pivot_wider where you can pass multiple values columns.

    tidyr::pivot_wider(df, names_from = number, values_from = c(job, company))
    
    # A tibble: 4 x 7
    #  name  job_1               job_2           job_3 company_1 company_2 company_3
    #  <chr> <chr>               <chr>           <chr> <chr>     <chr>     <chr>    
    #1 John  Product_Manager     NA              NA    Google    NA        NA       
    #2 Sam   software_developer  Product_Manager NA    Microsoft Microsoft NA       
    #3 Matt  hr_director         NA              NA    Chevron   NA        NA       
    #4 Cassy Head_of_Investments CEO             CFO   Apple     JP_Morgan Amazon   
    

    You can also use data.table dcast

    library(data.table)
    dcast(setDT(df), name~number, value.var = c("job", "company"))