I'm trying to use OHDSI's version of the SelfControlledCaseSeries package, which uses the ff package to handle big data, but something is not working with the ffwhich function. Running the following example, provided in the ffwhich documentation:
install.packages("ff")
install.packages("ffbase")
x <- ff::ff(10:1)
idx <- ffbase::ffwhich(x, x < 5)
gives me
Error in if (by < 1) stop("'by' must be > 0") :
missing value where TRUE/FALSE needed
In addition: Warning message:
In chunk.default(from = 1L, to = 5L, by = c(integer = 46116860184273880), :
NAs introduced by coercion to integer range
I have tried setting ffbatchbytes to something smaller, running the script on another computer, and changing the location where the ff files are stored, but the error remains.
options("ffbatchbytes"= getOption("ffmaxbytes")/2)
options(fftempdir="C:/Users/OskarG/Desktop/ff_files")
Any ideas on how to fix this?
A similar error was reported on the package's GitHub. It appears to be an issue with the operating system (Windows 10?). @jwijffels gives the reason in the comments:
Haven't got windows 10 machine myself but the problem clearly comes from ff::chunk, namely from ff::chunk.ff_vector which is defined as follows
The relevant part is this: b <- BATCHBYTES%/%RECORDBYTES. This calculation apparently on your machine gives 23058430092136940 for reasons beyond my understanding (given that you report it works on Rgui but not on RStudio).
You could probably get around on this by changing option ffbatchbytes to something like this options(ffbatchbytes = 84882227) - which is the number I have on my oldskool windows 7
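To see why an oversized batch size produces exactly this error, the failing step can be sketched in base R, without ff. This is only an illustration of the mechanism, not the actual ff::chunk source: it takes the `by` value printed in the question's warning, coerces it to integer, and runs the same `by < 1` check that appears in the error message.

```r
# Value copied from the warning in the question; it is far larger than
# .Machine$integer.max (2147483647).
by <- 46116860184273880

# Coercing it to integer cannot succeed, so R returns NA and warns
# "NAs introduced by coercion to integer range" (suppressed here).
by_int <- suppressWarnings(as.integer(by))
is.na(by_int)

# NA < 1 is NA, and `if` cannot branch on NA, hence the reported error.
res <- tryCatch(
  if (by_int < 1) stop("'by' must be > 0"),
  error = function(e) conditionMessage(e)
)
res
```

So the `'by' must be > 0` check never gets to fire: the condition itself is NA, which is why a smaller ffbatchbytes makes the problem go away.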
I was able to reproduce your error and correct it using the above suggestion:
library("ff")
library("ffbase")
options(ffbatchbytes = 84882227) #add this line in
x <- ff::ff(10:1)
idx <- ffwhich(x, x < 5)
x[idx][]
# [1] 4 3 2 1
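If you would rather not hard-code the value on machines where the default is fine, a small guard can apply the fix only when needed. This is a rough sketch: `safe_batchbytes` is a hypothetical helper (not part of ff or ffbase), and treating anything above `.Machine$integer.max` as suspicious is my own conservative assumption, not a rule taken from the ff documentation. The fallback is the number quoted above.

```r
# Hypothetical helper, not part of ff/ffbase: return the current setting if it
# looks sane, otherwise the fallback value suggested in the quoted comment.
safe_batchbytes <- function(bb, fallback = 84882227) {
  ok <- is.numeric(bb) && length(bb) == 1L && is.finite(bb) &&
    bb >= 1 && bb <= .Machine$integer.max  # assumed threshold, see lead-in
  if (ok) bb else fallback
}

# Intended to be run after library("ff"), which normally sets this option.
options(ffbatchbytes = safe_batchbytes(getOption("ffbatchbytes")))
```

Put this right after the library() calls and before any ff or ffbase call that chunks data.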