Search code examples
rindexingspread

Spreading a character column by time and id r


Question in style with many others, but still different. I often see people asking for a way to spread a column into several, but it is often in a df where there is a measurement for each name in the column.

Like this:

head(df)
         id time   fish weight
        1  1    1 marlin      4
        2  1    1    cod      1
        3  1    2    cod      1
        4  2    1 salmon      2
        5  2    1    cod      2
        6  2    2    cod      3

So I can use spread like this (or dcast or similar:

df<-spread(df, fish,weight, fill=F)
   id time cod marlin salmon
1  1    1   1      4   <NA>
2  1    2   1   <NA>   <NA>
3  2    1   2   <NA>      2
4  2    2   3   <NA>   <NA>

But what if you do not have a value (here weight) for the variable, but just want to spread the types of fish? So output was like this

  id time   Fish1      Fish2
   1    1   marlin    salmon
   1    2   cod         <NA>
   2    1   salmon       cod
   2    2   cod         <NA>

how do you do that? Thank you for any help. It is greatly appreciated.


Solution

  • We need a group by sequence

    df %>%
      select(-weight) %>%
      group_by(id, time) %>% 
      mutate(ind = paste0("Fish", row_number())) %>%
      spread(ind, fish)
    # A tibble: 4 x 4
    # Groups:   id, time [4]
    #     id  time Fish1  Fish2
    #  <int> <int> <chr>  <chr>
    #1     1     1 marlin cod  
    #2     1     2 cod    NA   
    #3     2     1 salmon cod  
    #4     2     2 cod    NA   
    

    data

    df <- structure(list(id = c(1L, 1L, 1L, 2L, 2L, 2L), time = c(1L, 1L, 
    2L, 1L, 1L, 2L), fish = c("marlin", "cod", "cod", "salmon", "cod", 
    "cod"), weight = c(4L, 1L, 1L, 2L, 2L, 3L)), .Names = c("id", 
    "time", "fish", "weight"), class = "data.frame", row.names = c("1", 
    "2", "3", "4", "5", "6"))