
R strsplit, nested lists blues


I have an issue in R: I want to split strings on commas and then split each piece on semicolons, keeping only the first item before the semicolon, i.e. "ee" and "jj" below. I have tried a bunch of things, but the nested lists seem too convoluted!

Here's what I am doing:

d <- c("aa,bb,cc,dd,ee;e,ff",
       "gg,hh,ii,jj;j")

e <- strsplit(d, ",")
myfun2 <- function(x, arg1) {
  strsplit(x, ";")
}
f <- lapply(e, myfun2)
f=
 [[1]]
 [[1]][[1]]
 [1] "aa"

 [[1]][[2]]
 [1] "bb"

 [[1]][[3]]
 [1] "cc"

 [[1]][[4]]
 [1] "dd"

 [[1]][[5]]
 [1] "ee" "e" 

 [[1]][[6]]
 [1] "ff"

 [[2]]
 [[2]][[1]]
 [1] "gg"

 [[2]][[2]]
 [1] "hh"

 [[2]][[3]]
 [1] "ii"

 [[2]][[4]]
 [1] "jj" "j" 

Here's the output that I want

Correct output=
[[1]]
[1] "aa" "bb" "cc" "dd" "ee" "ff"

[[2]]
[1] "gg" "hh" "ii" "jj"

I have tried several ways of applying lapply to the nested list "f", using "[[" and "[", but without success.

Any help is greatly appreciated. (I know that I am missing something silly, but just can't figure it out right now!)


Solution

  • This is your code

    d <- c("aa,bb,cc,dd,ee;e,ff", "gg,hh,ii,jj;j")
    e <- strsplit(d, ",")
    myfun2 <- function(x, arg1) { strsplit(x, ";") }
    f <- lapply(e, myfun2)
    

    If we start from your f, then the next step would be

    lapply(f,function(x) mapply(`[`,x,1))
    
    [[1]]
    [1] "aa" "bb" "cc" "dd" "ee" "ff"
    
    [[2]]
    [1] "gg" "hh" "ii" "jj"
    

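    Basically, you need an outer and an inner apply-type function to go down the two levels of nesting: lapply walks over the elements of f, and mapply pulls the first element out of each split piece.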
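
The same outer/inner idea can also be written in one pass from d, without building f first. This is only a sketch: the helper name first_piece and the variables result and alt are made up for illustration, and it assumes you want the text before the first semicolon of every comma-separated piece.

```r
d <- c("aa,bb,cc,dd,ee;e,ff", "gg,hh,ii,jj;j")

# Inner step: split each comma-separated piece on ";" and keep element 1.
# vapply checks that every result is a single character string.
first_piece <- function(pieces) {
  vapply(strsplit(pieces, ";", fixed = TRUE), `[`, character(1), 1)
}

# Outer step: lapply walks over the comma-split vectors, one per input string.
result <- lapply(strsplit(d, ",", fixed = TRUE), first_piece)

# Alternative without nesting: delete each ";..." suffix (up to the next
# comma) first, then split once on commas.
alt <- strsplit(gsub(";[^,]*", "", d), ",", fixed = TRUE)
```

Both result and alt should match the desired output in the question. The gsub version avoids the nested list altogether, but it relies on the regular expression ";[^,]*" describing everything you want to discard.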