I have multiple Excel sheets of a dataset that I need to align.
Simplifying, it looks like this (this is just a simplification, in fact it has hundreds of columns and rows and there are multiple sheets of data):
As you can see, in the first sheet every patient has age, however in sheet n° 2 only patients 1, 3 and 4 have data on sex (and also those with sex missing are not present in the sheet). The result I want to have is in the third image so that the patients with sex missing are still reported but with blank. So I want to perform the alignment based on PATIENT NUMBER.
I want to know if there are packages or rapid ways to do it.
Merge them with a left join:
library(tidyverse)
a <- tibble(
patient = 1:5,
age = c(22, 30, 31, 50, 60)
)
b <- tibble(
patient = c(1, 3, 4),
sex = c(0, 1, 0)
)
left_join(a, b, join_by(patient))
#> # A tibble: 5 × 3
#> patient age sex
#> <dbl> <dbl> <dbl>
#> 1 1 22 0
#> 2 2 30 NA
#> 3 3 31 1
#> 4 4 50 0
#> 5 5 60 NA
Created on 2023-03-09 with reprex v2.0.2