r regex dataframe strsplit

Split comma separated pattern from data frame in R


I have a dataset like this:

Old <- data.frame(
  X1= c(
    "AD=17795,54;ARL=139;DEA=20;DER=20;DP=1785",
    "DP=4784;AD=4753,23;ARL=123;DEA=5;DER=5",
    "ARL=149;AD=30,9;DEA=25;DER=25;DP=3077",
    "AD=244,49;ARL=144;DEA=7;DER=7;DP=245"
    ))


X1
AD=17795,54;ARL=139;DEA=20;DER=20;DP=1785
DP=4784;AD=4753,23;ARL=123;DEA=5;DER=5
ARL=149;AD=30,9;DEA=25;DER=25;DP=3077
AD=244,49;ARL=144;DEA=7;DER=7;DP=245 

I want to extract the value of the AD field (AD=xxx,xx) from each ";"-separated string and add it as a new column. The desired output is:

X1                                              X2
AD=17795,54;ARL=139;DEA=20;DER=20;DP=1785       17795,54
DP=4784;AD=4753,23;ARL=123;DEA=5;DER=5          4753,23
ARL=149;AD=30,9;DEA=25;DER=25;DP=3077           30,9
AD=244,49;ARL=144;DEA=7;DER=7;DP=245            244,49

I have tried this:

Old$X2<-mapply(
  function(x,  i) x[i],
  strsplit(X1, ";"),
  lapply(strsplit(X1, ";"), function(x) which(x == "AD="))
)

Solution

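  • We can use sub with a capture group. The pattern matches the whole string, captures the "digits,digits" value that follows AD=, and replaces the string with that captured value (\\1). Note that the pattern expects a ";" after the AD field, which holds for every row in the example.

    Old$X2 <- sub(".*AD\\=(\\d+,\\d+);.*", "\\1", Old$X1)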
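  • For comparison, the strsplit attempt from the question can also be made to work. As written it fails for two reasons: X1 is referenced without Old$, so R cannot find the object, and x == "AD=" tests for exact equality, so no field such as "AD=17795,54" ever matches. The following is a minimal sketch of one way to repair it, not the only one; extract_ad is a hypothetical helper name introduced here for illustration, and stringsAsFactors = FALSE is added on the assumption that the code may run on R versions before 4.0.

```r
# Sample data from the question
Old <- data.frame(
  X1 = c(
    "AD=17795,54;ARL=139;DEA=20;DER=20;DP=1785",
    "DP=4784;AD=4753,23;ARL=123;DEA=5;DER=5",
    "ARL=149;AD=30,9;DEA=25;DER=25;DP=3077",
    "AD=244,49;ARL=144;DEA=7;DER=7;DP=245"
  ),
  stringsAsFactors = FALSE
)

# extract_ad (hypothetical helper): split each string on ";" and
# return the value of the first field that starts with "AD=".
extract_ad <- function(x) {
  fields <- strsplit(as.character(x), ";", fixed = TRUE)
  vapply(fields, function(f) {
    hit <- f[startsWith(f, "AD=")]
    # Return NA when a row has no AD field instead of failing
    if (length(hit) == 0) NA_character_ else sub("^AD=", "", hit[1])
  }, character(1))
}

Old$X2 <- extract_ad(Old$X1)
Old$X2
# "17795,54" "4753,23" "30,9" "244,49"
```

    Unlike the sub pattern above, this version does not depend on a ";" following the AD field, and it returns NA for rows where AD is absent rather than returning the input string unchanged.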