I have a folder from which I have to delete approximately 4,000 .rds files every day. The files are small (73 KB at most), but deleting them from R takes a long time, and deleting them manually is just as slow. Is there a faster way to delete them?
What I do to delete files:
# ***********************************************************************
# METHOD # 1 :
# list all files in the folder
files2 <- list.files("/Volumes/share/ZZZ/GOOGLE1/")
# remove them one at a time with lapply() and file.remove()
TR <- lapply(files2, function(x) file.remove(paste0("/Volumes/share/ZZZ/GOOGLE1/", x)))
# ***********************************************************************
# METHOD #2 :
do.call(unlink,list(list.files("/Volumes/share/ZZZ/GOOGLE1/",full.names=TRUE)))
# ***********************************************************************
# METHOD # 3 :
# note: with recursive=TRUE this also removes the GOOGLE1 directory itself, not just its contents
unlink("/Volumes/share/ZZZ/GOOGLE1/", recursive=TRUE, force=TRUE)
I tested all three methods by deleting 100 files with each.
RESULTS (system.time() output, in seconds):
METHOD #1 :
user system elapsed
0.014 0.064 44.133
METHOD #2 :
user system elapsed
0.010 0.047 36.447
METHOD #3 :
user system elapsed
0.009 0.057 43.400
sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)
unlink() accepts wildcards, so you can do the following, which seems quite fast on my system:
system.time({ unlink('*.rds'); }); ## deleted 4000 ~65KB files
## user system elapsed
## 0.140 0.922 1.151
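Since `unlink('*.rds')` acts on the current working directory, it may help to build the pattern from an explicit folder path. The following is a minimal sketch under that assumption; the `rds_demo` folder under `tempdir()` and the file names are hypothetical stand-ins for the real share:

```r
# Hypothetical demo folder; substitute the real directory path.
dir <- file.path(tempdir(), "rds_demo")
dir.create(dir, showWarnings = FALSE)

# Create a few small .rds files plus one unrelated file.
for (i in 1:20) saveRDS(i, file.path(dir, sprintf("f%03d.rds", i)))
writeLines("keep me", file.path(dir, "notes.txt"))

# unlink() expands the wildcard itself and returns 0 on success, 1 on failure.
status <- unlink(file.path(dir, "*.rds"))

# Only files matching the pattern are removed; notes.txt is left alone.
remaining <- list.files(dir)
```

Because only the `*.rds` pattern is passed, the folder itself and any other files in it are left in place.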
Note that @Thomas's suggestion of using system() with wait=F is a good idea, but has several drawbacks: (1) it is platform-dependent, (2) you will not be able to check the return code of the removal command, since it is run asynchronously, and (3) it may introduce a race condition; for example, if subsequent code quickly writes a new *.rds file, then it could end up being deleted by the asynchronous removal command.
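By contrast, a synchronous `unlink()` call returns only once the removal has finished, so its status can be checked and later writes cannot be caught by it. A rough sketch of that pattern follows; the helper `remove_rds()` and the `rds_sync_demo` folder are hypothetical names introduced for illustration:

```r
# Hypothetical helper: remove all .rds files in `dir` and fail loudly on error.
remove_rds <- function(dir) {
  status <- unlink(file.path(dir, "*.rds"))  # 0 = success, 1 = failure
  if (status != 0) stop("failed to remove some .rds files in ", dir)
  invisible(status)
}

# Hypothetical demo folder; substitute the real directory path.
dir <- file.path(tempdir(), "rds_sync_demo")
dir.create(dir, showWarnings = FALSE)
saveRDS(1:3, file.path(dir, "old.rds"))

remove_rds(dir)                          # returns only after deletion has finished
saveRDS(4:6, file.path(dir, "new.rds"))  # written afterwards, so it is not deleted

files_after <- list.files(dir)
```

The trade-off is that the calling code waits for the deletion to complete instead of continuing immediately.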