
Combine data frames by matching characters in a column


I want to combine two data frames by matching a sequence of characters that they share. Here is my data:

d1 <- data.frame(Name = c("aaa", "bbb", "ccc","ddd","ggg", "eee"), ID = c("123", "456", "789", "101112", "131415", "161718"), stringsAsFactors = FALSE)

d2 <- data.frame(Code = c("123.aR16", "456d245", "14asadf789", "123_dy6r", "202122-fsd", "101112gh"), CupCake = c("a1", "a2", "a3", "a4", "a5", "a6"), stringsAsFactors = FALSE)

If a value in Code contains the digit sequence of an ID, that row should get the matching ID and its Name.

In effect, the value from Name is copied over to every matching row.

Expected output:

  Name     ID       Code CupCake
1  aaa    123   123.aR16      a1
2  bbb    456    456d245      a2
3  ccc    789 14asadf789      a3
4  aaa    123   123_dy6r      a4
5   NA     NA 202122-fsd      a5
6  ddd 101112   101112gh      a6

Solution

  • Using tidyverse packages:

    library(dplyr)
    library(stringr)
    
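    # Create ID in d2: take the ID from d1 that occurs in Code.
    # (Extracting the first run of digits with "[0-9]+" would return "14" for "14asadf789".)
    d2 <- mutate(d2, ID = str_extract(Code, str_c(d1$ID, collapse = "|")))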
    
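    # Merge d1 and d2 based on ID (keeps unmatched rows from both data frames)
    df <- full_join(d1, d2, by = "ID")
    
    # Edit: if you only want rows whose ID appears in both data frames
    df1 <- inner_join(d1, d2, by = "ID")
    
    # Or, with the columns of d2 first
    df2 <- inner_join(d2, d1, by = "ID")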
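For comparison, here is a minimal base R sketch that should reproduce the expected output from the question, with one row per row of d2 and NA where no ID matches. It assumes that "contains" means the ID occurs as a literal substring of Code and that the first matching ID wins if several match. The helper name match_id is made up for this illustration.

```r
# Example data from the question
d1 <- data.frame(Name = c("aaa", "bbb", "ccc", "ddd", "ggg", "eee"),
                 ID = c("123", "456", "789", "101112", "131415", "161718"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(Code = c("123.aR16", "456d245", "14asadf789",
                          "123_dy6r", "202122-fsd", "101112gh"),
                 CupCake = c("a1", "a2", "a3", "a4", "a5", "a6"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: return the first ID that occurs in `code` as a
# literal substring (fixed = TRUE turns off regex), or NA if none does.
match_id <- function(code, ids) {
  found <- vapply(ids, function(id) grepl(id, code, fixed = TRUE), logical(1))
  if (any(found)) ids[found][1] else NA_character_
}

# Look up an ID for every Code, then copy the corresponding Name over
d2$ID <- vapply(d2$Code, match_id, character(1), ids = d1$ID, USE.NAMES = FALSE)
d2$Name <- d1$Name[match(d2$ID, d1$ID)]

# Reorder the columns as in the expected output
result <- d2[, c("Name", "ID", "Code", "CupCake")]
print(result)
```

Because the lookup starts from d2, rows of d1 without a matching Code ("ggg" and "eee") are dropped, which is the same behaviour as a left join from d2.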