R Data transformation/manipulation


I have data in the format:

Sender  Action  Recipient   Operation
Sender1 Update  Recipient3  Operation1
Sender2 Update  Recipient4  Operation2
Sender3 Update  Recipient5  Operation3
Sender1 Update  Recipient6  Operation1
Sender2 Delete  Recipient3  Operation4
Sender3 Delete  Recipient4  Operation5
Sender1 Update  Recipient5  Operation1
Sender2 Delete  Recipient6  Operation4
Sender1 Delete  Recipient3  Operation6

I would like the data in the following format, with one line per Operation and the number of user columns growing dynamically with the number of recipients tied to that operation:

Operation   User1   Action    User2       User3      User4
Operation1  Sender1 Update  Recipient3  Recipient6  Recipient5
Operation2  Sender2 Update  Recipient4      
Operation3  Sender3 Update  Recipient5      
Operation4  Sender2 Delete  Recipient3  Recipient6  
Operation5  Sender3 Delete  Recipient4      
Operation6  Sender1 Delete  Recipient3

How do I accomplish this in R?


Solution

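  • You can use pivot_wider from tidyr to reshape the data to wide format: number the recipients within each Operation, then spread them into one column per number.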

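    library(dplyr)

    result <- df %>%
      rename(User1 = Sender) %>%
      group_by(Operation) %>%
      # number the recipients within each Operation: User2, User3, ...
      mutate(col = paste0('User', row_number() + 1)) %>%
      ungroup() %>%
      # one column per numbered recipient; missing recipients become NA
      tidyr::pivot_wider(names_from = col, values_from = Recipient) %>%
      select(Operation, User1, everything())

    result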
    
    #  Operation  User1   Action User2      User3      User4     
    #  <chr>      <chr>   <chr>  <chr>      <chr>      <chr>     
    #1 Operation1 Sender1 Update Recipient3 Recipient6 Recipient5
    #2 Operation2 Sender2 Update Recipient4 NA         NA        
    #3 Operation3 Sender3 Update Recipient5 NA         NA        
    #4 Operation4 Sender2 Delete Recipient3 Recipient6 NA        
    #5 Operation5 Sender3 Delete Recipient4 NA         NA        
    #6 Operation6 Sender1 Delete Recipient3 NA         NA        
    

    Data:

    df <- structure(list(Sender = c("Sender1", "Sender2", "Sender3", "Sender1", 
    "Sender2", "Sender3", "Sender1", "Sender2", "Sender1"), Action = c("Update", 
    "Update", "Update", "Update", "Delete", "Delete", "Update", "Delete", 
    "Delete"), Recipient = c("Recipient3", "Recipient4", "Recipient5", 
    "Recipient6", "Recipient3", "Recipient4", "Recipient5", "Recipient6", 
    "Recipient3"), Operation = c("Operation1", "Operation2", "Operation3", 
    "Operation1", "Operation4", "Operation5", "Operation1", "Operation4", 
    "Operation6")), class = "data.frame", row.names = c(NA, -9L))
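
    To see where the dynamic column names come from, it may help to look at the numbering step on its own. The following is a minimal sketch on a hypothetical three-row subset of the question's data (the names toy, numbered and wide are made up for illustration); it assumes dplyr and tidyr are installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical three-row subset of the question's data
toy <- data.frame(
  Sender    = c("Sender1", "Sender2", "Sender1"),
  Action    = c("Update", "Update", "Update"),
  Recipient = c("Recipient3", "Recipient4", "Recipient6"),
  Operation = c("Operation1", "Operation2", "Operation1")
)

# Step 1: number the recipients within each Operation.
# row_number() restarts at 1 in every group, so adding 1 gives User2, User3, ...
numbered <- toy %>%
  group_by(Operation) %>%
  mutate(col = paste0("User", row_number() + 1)) %>%
  ungroup()

numbered$col
# "User2" "User2" "User3"

# Step 2: spread the numbered recipients into one column each.
# Operations with fewer recipients get NA in the unused columns.
wide <- numbered %>%
  pivot_wider(names_from = col, values_from = Recipient)

names(wide)
# "Sender" "Action" "Operation" "User2" "User3"
```

    Because pivot_wider creates one column for each distinct value of col, the number of User columns follows the Operation with the most recipients. Note that all remaining columns (here Sender, Action and Operation) identify a row, so an Operation with more than one Sender or Action would come out as several rows rather than one.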