I have two disk.frames, each about 20GB worth of files.
They're too big to merge as data.tables because the process requires more memory than I have available. I tried using this code: output <- rbindlist(list(df1, df2))
The wrinkle is that I'd like to also run unique since there might be dups in my data.
Can I use the same code with rbindlist on two disk.frames?
Yeah. You just do rbindlist.disk.frame(list(df1, df2))
I need to implement bind_rows at some point too!
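As a rough sketch of what that call looks like end to end, here is a toy version with two tiny disk.frames standing in for the 20GB ones. The column names, values, and nchunks = 2 are made up for illustration, and this assumes a disk.frame version that exports rbindlist.disk.frame and as.disk.frame.

```r
library(disk.frame)

# Toy stand-ins for the two large disk.frames (hypothetical columns and values).
# Each is written to a temporary folder on disk in 2 chunks.
df1 <- as.disk.frame(
  data.frame(id = 1:4, value = c("a", "b", "c", "d")),
  nchunks = 2
)
df2 <- as.disk.frame(
  data.frame(id = 3:6, value = c("c", "d", "e", "f")),
  nchunks = 2
)

# Row-bind the two disk.frames. The result is a new disk.frame written to disk
# (a temporary folder unless outdir is given), so the combined data is not
# pulled into memory as a single table.
combined <- rbindlist.disk.frame(list(df1, df2))

# Total row count across all chunks of the combined disk.frame.
nrow(combined)
```

Note that this only binds the rows; it does not deduplicate. Running unique on each chunk separately would only remove duplicates that happen to land in the same chunk, so a global unique needs more thought than this sketch covers.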