
cSplit_e from splitstackshape package not accounting for NA's?


I wanted to follow up on the question that I posted here. While I received base R and data.table solutions, I was trying to implement the same thing using cSplit_e from the splitstackshape package, as suggested in a comment on my previous post. With the data modified as below (i.e. with NAs added):

data1 <- structure(list(reason = c("1", "1", NA, "1", "1", "4 5", "1",
"1", "1", "1", "1", "1 2 3 4", "1 2 5", NA, NA)), .Names = "reason",
class = "data.frame", row.names = c(NA, -15L))

# load packages
library(data.table)
library(splitstackshape)

cSplit_e(setDT(data1), 1, " ", mode = "value") # fails when NAs are present

Error in seq.default(min(vec), max(vec)) : 'from' must be a finite number
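The message itself suggests that a range is being built from values that include NA. The following is only a base R sketch of that failure mode, assuming (not confirmed against the package source) that the affected version computes something like seq(min(vec), max(vec)) over the split values; the vector vec here is made up for illustration.

```r
# Hypothetical values as they might look after splitting, including an NA.
vec <- c(1, NA, 4, 5)

# min() and max() propagate NA unless na.rm = TRUE is given.
lo <- min(vec)
hi <- max(vec)

# seq() refuses a non-finite 'from'; capture the message instead of stopping.
msg <- tryCatch(seq(lo, hi), error = function(e) conditionMessage(e))
print(msg)

# Dropping NAs before taking the range avoids the error.
rng <- seq(min(vec, na.rm = TRUE), max(vec, na.rm = TRUE))
print(rng)
```

If this is indeed what happens internally, any NA in the column would be enough to trigger the error, which matches what I see.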

data2 <- na.omit(setDT(data1), cols = "reason") # remove rows with NA

cSplit_e(data2, 1, " ", mode = "value") # works once the NAs are removed
     reason reason_1 reason_2 reason_3 reason_4 reason_5
 1:       1        1       NA       NA       NA       NA
 2:       1        1       NA       NA       NA       NA
 3:       1        1       NA       NA       NA       NA
 4:       1        1       NA       NA       NA       NA
 5:     4 5       NA       NA       NA        4        5
 6:       1        1       NA       NA       NA       NA
 7:       1        1       NA       NA       NA       NA
 8:       1        1       NA       NA       NA       NA
 9:       1        1       NA       NA       NA       NA
10:       1        1       NA       NA       NA       NA
11: 1 2 3 4        1        2        3        4       NA
12:   1 2 5        1        2       NA       NA        5

So the question is: does cSplit_e account for NAs in the column to be split?


Solution

  • This has been fixed in the bugfix release (v1.4.4) of "splitstackshape". Thanks for reporting it.

    After using update.packages(), you should be able to do:

    packageVersion("splitstackshape")
    ## [1] ‘1.4.4’
    
    cSplit_e(data1, 1, " ", mode = "value")
    ##     reason reason_1 reason_2 reason_3 reason_4 reason_5
    ## 1        1        1       NA       NA       NA       NA
    ## 2        1        1       NA       NA       NA       NA
    ## 3     <NA>       NA       NA       NA       NA       NA
    ## 4        1        1       NA       NA       NA       NA
    ## 5        1        1       NA       NA       NA       NA
    ## 6      4 5       NA       NA       NA        4        5
    ## 7        1        1       NA       NA       NA       NA
    ## 8        1        1       NA       NA       NA       NA
    ## 9        1        1       NA       NA       NA       NA
    ## 10       1        1       NA       NA       NA       NA
    ## 11       1        1       NA       NA       NA       NA
    ## 12 1 2 3 4        1        2        3        4       NA
    ## 13   1 2 5        1        2       NA       NA        5
    ## 14    <NA>       NA       NA       NA       NA       NA
    ## 15    <NA>       NA       NA       NA       NA       NA
    

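    Note that 1.4.4 has moved "data.table" from "Depends" to "Imports", so "data.table" is no longer attached automatically when you load "splitstackshape". Call library(data.table) yourself if you use its functions (such as setDT) directly.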
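If updating is not an option, the same NA-preserving layout can be approximated in base R. This is only a sketch: expand_values is a hypothetical helper written for this answer, not a function from "splitstackshape", and it assumes the column holds space-separated whole numbers.

```r
# Hypothetical helper (not part of splitstackshape): expand a character
# vector of separated integers into one column per value, keeping NA rows.
expand_values <- function(x, sep = " ", prefix = "reason") {
  parts <- strsplit(as.character(x), sep, fixed = TRUE)
  # An NA input splits to a single NA; drop it so the row stays empty.
  vals <- lapply(parts, function(p) as.integer(p[!is.na(p)]))
  all_vals <- unlist(vals)
  stopifnot(length(all_vals) > 0) # at least one non-NA value is required
  rng <- seq(min(all_vals), max(all_vals))
  out <- matrix(NA_integer_, nrow = length(x), ncol = length(rng),
                dimnames = list(NULL, paste(prefix, rng, sep = "_")))
  for (i in seq_along(vals)) {
    v <- vals[[i]]
    # Place each value in the column that corresponds to it.
    if (length(v) > 0) out[i, match(v, rng)] <- v
  }
  out
}

reason <- c("1", "1", NA, "1", "1", "4 5", "1", "1", "1", "1", "1",
            "1 2 3 4", "1 2 5", NA, NA)
res <- cbind(data.frame(reason = reason), expand_values(reason))
print(res)
```

Rows whose input is NA come out as all-NA rows instead of raising an error, which mirrors the behaviour of the fixed release shown above.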