Search code examples
rdatabasedataframealignmenttext-alignment

Align dataframes in R


I have multiple Excel sheets of a dataset that I need to align. Simplifying, it looks like this (this is just a simplification, in fact it has hundreds of columns and rows and there are multiple sheets of data): enter image description here As you can see, in the first sheet every patient has age, however in sheet n° 2 only patients 1, 3 and 4 have data on sex (and also those with sex missing are not present in the sheet). The result I want to have is in the third image so that the patients with sex missing are still reported but with blank. So I want to perform the alignment based on PATIENT NUMBER. I want to know if there are packages or rapid ways to do it.


Solution

  • Merge them with a left join:

    library(tidyverse)
    
    a <- tibble(
      patient = 1:5,
      age = c(22, 30, 31, 50, 60)
    )
    
    b <- tibble(
      patient = c(1, 3, 4),
      sex = c(0, 1, 0)
    )
    
    left_join(a, b, join_by(patient))
    #> # A tibble: 5 × 3
    #>   patient   age   sex
    #>     <dbl> <dbl> <dbl>
    #> 1       1    22     0
    #> 2       2    30    NA
    #> 3       3    31     1
    #> 4       4    50     0
    #> 5       5    60    NA
    

    Created on 2023-03-09 with reprex v2.0.2