r list memory in-place

Dropping a list element in-place


Dropping an element from a list by conventional means (for example, ll["name"] <- NULL) causes the entire list to be copied. Normally this is not noticeable, but it becomes a problem once the data sets are large.

I have a list with a dozen elements, each between 0.25 and 2 GB in size. Dropping three elements from this list takes about ten minutes (on a relatively fast machine).

Is there a way to drop elements from a list in-place?


I have tried the following:

TEST <- list(A=1:20,  B=1:5)

TEST[["B"]] <- NULL
TEST["B"] <- NULL
TEST <- TEST[c(TRUE, FALSE)]
data.table::set(TEST, "B", value=NULL) # ERROR

Output with memory info:

cat("\n\n\nATTEMPT 1\n")
TEST <- list(A=1:20,  B=1:5)
.Internal(inspect(TEST))
TEST[["B"]] <- NULL
.Internal(inspect(TEST))

cat("\n\n\nATTEMPT 2\n")
TEST <- list(A=1:20,  B=1:5)
.Internal(inspect(TEST))
TEST["B"] <- NULL
.Internal(inspect(TEST))

cat("\n\n\nATTEMPT 3\n")
TEST <- list(A=1:20,  B=1:5)
.Internal(inspect(TEST))
TEST <- TEST[c(TRUE, FALSE)]
.Internal(inspect(TEST))

Solution

  • I don't know of a way to make a vector shorter without copying it. The next best thing would be to set the element to NA or NULL.

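    According to ?Extract, you have to write TEST[i] <- list(NULL) to set an element to NULL (a plain TEST[i] <- NULL deletes the element instead). My tests indicate that i must be an integer or logical vector. In the output below, the list keeps the same address (@27d2c60) after the assignment, so it was modified rather than copied.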

    > TEST <- list(A=1:20,  B=1:5); .Internal(inspect(TEST))
    @27d2c60 19 VECSXP g0c2 [NAM(1),ATT] (len=2, tl=0)
      @27dd9e0 13 INTSXP g0c6 [] (len=20, tl=0) 1,2,3,4,5,...
      @2805c98 13 INTSXP g0c3 [] (len=5, tl=0) 1,2,3,4,5
    ATTRIB:
      @1f38be8 02 LISTSXP g0c0 [] 
        TAG: @d3f478 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "names" (has value)
        @2807430 16 STRSXP g0c2 [] (len=2, tl=0)
          @dc2628 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "A"
          @dc25f8 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "B"
    > TEST[2] <- list(NULL); .Internal(inspect(TEST)); TEST
    @27d2c60 19 VECSXP g0c2 [MARK,NAM(1),ATT] (len=2, tl=0)
      @27dd9e0 13 INTSXP g0c6 [MARK] (len=20, tl=0) 1,2,3,4,5,...
      @d3fb78 00 NILSXP g1c0 [MARK,NAM(2)] 
    ATTRIB:
      @1f38be8 02 LISTSXP g0c0 [MARK] 
        TAG: @d3f478 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "names" (has value)
        @2807430 16 STRSXP g0c2 [MARK] (len=2, tl=0)
          @dc2628 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "A"
          @dc25f8 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "B"
    $A
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    
    $B
    NULL
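
The approach above can be sketched as follows. This is only an illustration with tiny, made-up data (the element C and the variable name emptied are not from the original post); it shows that the slot survives with a NULL value and how one might skip such slots later. The value that was overwritten becomes eligible for garbage collection once nothing else references it.

```r
# Hypothetical data; sizes are tiny so the sketch runs instantly.
TEST <- list(A = 1:20, B = 1:5, C = letters)

# Overwrite element 2 with NULL instead of deleting it.
# list(NULL) is required: a bare NULL on the right-hand side
# would delete the element and shorten the list.
TEST[2] <- list(NULL)

length(TEST)        # 3: the slot is still there
names(TEST)         # "A" "B" "C"
is.null(TEST[[2]])  # TRUE

# Find the emptied slots later, e.g. to skip them when iterating.
emptied <- vapply(TEST, is.null, logical(1))
names(TEST)[!emptied]  # "A" "C"
```

The trade-off is that the list keeps its original length and names, so any code that loops over it has to tolerate NULL entries.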