Search code examples
rmerge

Combining tables based on elements from one of the columns


This is a bit hard to explain so I'll show an example below, but I basically have two tables and I want to create a new table with a new column by matching what's in the rows of another table. I also want to note that these tables are very large (10,000+ rows).Is there a way to do this in R?

Let's say that these are the tables that I have

Table 1-sample
obs      code   genSym   
14       E444     5     
28       3330     6        
33       7555     9      
35       E777     10     

Table 2-description
row_id  code   type   
1       E444    INS             
3       7555    MMS      
5       5679    SMO 
6       A678    LLX 
7       A900    IRR
8       3330    DEL
9       4545    SLL  
10      E777    FOO

and I want it to look like this

obs      code   genSym  type 
14       E444     5      INS
28       3330     6      DEL  
33       7555     9      MMS
35       E777     10     FOO 

My first table is my sample table and my second table is my description table that has all of the descriptions of what the codes mean. I want to match the "type" in my first table based on the codes in the first table.


Solution

  • You need to do a left join, which can be accomplished multiple ways. This is a way to do it using the tidyverse:

    library(tidyverse)
    
    table_1 = tibble(
      obs = c(14, 28, 33, 35),
      code = c("E444", "3330", "7555", "E777"),
      genSym = c(5, 6, 9, 10)
    )
    
    table_2 = tibble(
      code = c("E444", "7555", "5679", "A678", "A900", "3330", "4545", "E777"),
      type =c("INS", "MMS", "SMO", "LLX", "IRR", "DEL", "SLL", "FOO")
    )
    
    table_1 %>% left_join(table_2)
    Joining, by = "code"
    # A tibble: 4 x 4
        obs code  genSym type 
      <dbl> <chr>  <dbl> <chr>
    1    14 E444       5 INS  
    2    28 3330       6 DEL  
    3    33 7555       9 MMS  
    4    35 E777      10 FOO