Append = TRUE option of the R dump() function


For a simulation study I would like to save the results from one sample in an R file and then append the results from the next sample to that file.

To achieve this, I am using the function dump(). To append the data from the next simulation to the file, I want to use the append = TRUE option of this function. However, this option is not working for me.

When I simulate data and save it as an R file with dump(), and then do the same again with append = TRUE, dump() seems to overwrite the data instead of appending to the file.

What am I doing wrong?

To illustrate the problem, this is my example code:

#Simulate data 
x <- rnorm(10)
y <- rnorm(10)
xy <- data.frame(x,y)

#Dump into R file "xy.R" with option append = TRUE
dump("xy", file = "xy.R", append = TRUE)
rm(xy)  # remove the dataset from the current environment

#Retrieve data from file: 
source("xy.R")
xy #10 rows

#Run the code again: Still 10 rows and not 20 as expected. 
#Old data is overwritten, new data is not appended.

Solution

  • Did you actually look at the output file?

    dump() writes the variable name together with the assignment to the file. That means when you call dump("xy"), it will write out

    xy <- ...
    

    and when you run it again, it will append the output to the same file, so it will end up writing

    xy <- ...
    xy <- ...
    

    So you've just defined the variable xy twice, and the last value wins. The append option does not add rows to the object already stored in the file; it just adds more text to the end of the file. You need to read and merge the data yourself before you dump it again.
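
    A minimal sketch of both points, assuming a throwaway file in tempdir() and a helper simulate_xy() that I made up for illustration (neither is from the question). It first shows that two appended dumps leave two assignments in the file, then does the read/merge step by hand before dumping again:

    ```r
    # Hypothetical file name and helper, used only for this illustration
    fn <- file.path(tempdir(), "xy_dump_demo.R")
    if (file.exists(fn)) file.remove(fn)
    simulate_xy <- function(n = 10) data.frame(x = rnorm(n), y = rnorm(n))

    # Two dumps with append = TRUE: the file now holds two "xy <-" assignments
    xy <- simulate_xy()
    dump("xy", file = fn, append = TRUE)
    xy <- simulate_xy()
    dump("xy", file = fn, append = TRUE)
    n_assign <- sum(grepl("^xy <-", readLines(fn)))  # 2

    # Sourcing runs both assignments in order, so only the last one survives
    rm(xy)
    source(fn, local = TRUE)
    rows_after_append <- nrow(xy)  # 10, not 20

    # Read/merge yourself: restore the old data, bind the new rows, overwrite the file
    new_xy <- simulate_xy()
    source(fn, local = TRUE)     # restores the previously dumped xy
    xy <- rbind(xy, new_xy)
    dump("xy", file = fn)        # append defaults to FALSE, so the file is replaced
    rm(xy)
    source(fn, local = TRUE)
    rows_after_merge <- nrow(xy) # 20
    ```

    Note the final dump() deliberately does not use append = TRUE, because the merged object already contains everything that was in the file.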

    Though, if you are dumping a data.frame, you would probably be better off using write.table with append = TRUE, and read.table. That's more likely to give you the behavior you want. Something like

    #Simulate data 
    x <- rnorm(10)
    y <- rnorm(10)
    xy <- data.frame(x,y)
    
    fn <- "xy.txt"
    # If the file already exists, read the old rows and put the new ones below them
    if (file.exists(fn)) {
        xy <- rbind(read.table(fn), xy)
    }
    write.table(xy, file = fn)
    rm(xy)  # remove the dataset from the current environment
    
    #Retrieve data from file: 
    xy <- read.table(fn)
    

    Or perhaps

    #Simulate data 
    x <- rnorm(10)
    y <- rnorm(10)
    xy <- data.frame(x,y)
    
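    fn <- "xy.txt"
    if (file.exists(fn)) {
        # File exists: append the rows only, without repeating the header
        write.table(xy, file = fn, row.names = FALSE, col.names = FALSE, append = TRUE)
    } else {
        # First write: include the header
        write.table(xy, file = fn, row.names = FALSE, col.names = TRUE)
    }
    rm(xy)  # remove the dataset from the current environment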
    
    #Retrieve data from file: 
    xy <- read.table(fn, header = TRUE)
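
    For a simulation study with several samples, the append variant can be wrapped in a loop. This is only a sketch: the file name, the number of samples, and the extra sample column identifying each run are my own assumptions, not something from the question.

    ```r
    # Hypothetical file name and simulation sizes, chosen only for illustration
    fn <- file.path(tempdir(), "xy_append_demo.txt")
    if (file.exists(fn)) file.remove(fn)
    n_samples <- 3
    n_obs <- 10

    for (i in seq_len(n_samples)) {
        # The "sample" column is an added assumption so runs can be told apart later
        xy <- data.frame(sample = i, x = rnorm(n_obs), y = rnorm(n_obs))

        # Write the header only on the first pass; append rows on later passes
        first <- !file.exists(fn)
        write.table(xy, file = fn, row.names = FALSE,
                    col.names = first, append = !first)
    }

    # Retrieve all samples at once
    all_xy <- read.table(fn, header = TRUE)
    nrow(all_xy)          # 30 rows: 3 samples of 10 observations each
    table(all_xy$sample)  # 10 rows per sample
    ```

    Because each pass only appends text, the earlier results never have to be read back into memory, which is the main advantage over the read-then-rbind variant when there are many samples.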