Search code examples
raccumulatespreadwrangle

I want to make tidy data


I want to make spread data based on duplicate data. I want to accumulate column based on overlapping rows. For example, X2 column is repeated 5 times.

It looks similar to using spread function in tidyr package, but it's a little different. Could you give some idea for me?

INPUT:

  X1       X2       X3       
1 A         1       10                        
2 B         2       11              
3 C         3       13             
4 A         4       12            
5 F         5       16    
6 D         1       17
            .
            .
            .
515 E       5       18

OUTPUT:

  X2     X1.1   X3.1   X1.2   X3.2     ... 102 steps
1  1     A      10     D      17
2  2     B      11
3  3     C      13
4  4     A      12 
5  5     F      16

Solution

  • With dplyr and tidyr we can group by X2 and build tibbles that we'll unnest into the desired format.

    I modified your data slightly to have a full example :

    library(tidyverse)
    df1 %>% 
      group_by(X2) %>%
      summarize_all(~as.list(.) %>%
                      setNames(seq_along(.)) %>%
                      as_tibble %>% 
                      list) %>%
      unnest(.sep = ".")
    
    # # A tibble: 3 x 5
    #      X2  X1.1  X1.2  X3.1  X3.2
    #   <int> <chr> <chr> <int> <int>
    # 1     1     A     A    10    12
    # 2     2     B     F    11    16
    # 3     3     C     D    13    17
    

    And a legit tidy way :

    df1 %>%
      mutate(i = cumsum(X2 == 1)) %>%
      gather(key, value, X1,X3) %>%
      unite(key,key,i, sep=".") %>%
      spread(key, value)
    
    #   X2 X1.1 X1.2 X3.1 X3.2
    # 1  1    A    A   10   12
    # 2  2    B    F   11   16
    # 3  3    C    D   13   17
    

    data

    df1 <- read.table(header=TRUE,stringsAsFactors=FALSE,text="X1       X2       X3       
    1 A         1       10                        
    2 B         2       11              
    3 C         3       13             
    4 A         1       12            
    5 F         2       16    
    6 D         3       17")