I have a data frame with 717 unique rows, what I want to do now is, assuming the reference and joining MeSH ID are different, create new rows below where I move the reference ID into the same column as the Joining MeSH ID, deleting the reference MeSH ID column as I do so but keeping all other data in the row the same. So if a row in the table currently looks something like this:
Disease | Joining Mesh ID | Company | Mode of Action | Reference MeSH ID |
---|---|---|---|---|
Acute Myeloid Leukemia | D015470 | Gilead | CD3 agonist | D007951 |
I want the final product to look something like this:
Disease | Mesh ID | Company | Mode of Action | |
---|---|---|---|---|
Acute Myeloid Leukemia | D015470 | Gilead | CD3 agonist | |
Acute Myeloid Leukemia | D007951 | Gilead | CD3 agonist |
(There are 25 columns in total, but I've only shown a few for the sake of simplicity.)
As always, any help is greatly appreciated!
df <- read.table(header = T, text = "Disease Joining_Mesh_ID Company Mode_of_Action Reference_MeSH_ID
'Acute Myeloid Leukemia' D015470 Gilead 'CD3 agonist' D007951")
library(tidyverse)
df %>% pivot_longer(ends_with('_ID'), names_to = NULL, values_to = 'Mesh_ID')
#> # A tibble: 2 x 4
#> Disease Company Mode_of_Action Mesh_ID
#> <chr> <chr> <chr> <chr>
#> 1 Acute Myeloid Leukemia Gilead CD3 agonist D015470
#> 2 Acute Myeloid Leukemia Gilead CD3 agonist D007951
Created on 2021-07-04 by the reprex package (v2.0.0)