How can an ecdf object be used inside an R package?


I am trying to build an R package that contains several ecdf objects, each created with ecdf() (for example, from a variable such as p1). I put them in a list and saved it as an .rda file in the package's data folder. However, when I run a function from the package (call it b1) after installing the package, I get the following error:

b1(zzz[1,]) (zzz is a data frame and I ran the function on one row of it)
Error in fc(p1) : could not find function ".approxfun" 

fc is an ecdf function stored in the saved list. Inside the function, I load the list with data(list1) and then assign fc <- list1[[1]].

I also ran data(list1) at the console. After fc <- list1[[1]], I can see that fc is an ecdf object, but calling fc(1) gives the following error:

Error in fc(1) : could not find function ".approxfun"

If I let R decide whether fc is a function or data (I used package.skeleton and put fc in mylist), it treats fc as a function and creates fc.R, but that function does not run. Something like this is saved in fc.R:

fc <-
structure(function (v) 
.approxfun(x, y, v, method, yleft, yright, f), class = c("ecdf", 
"stepfun", "function"), call = quote(ecdf(yyy$p1)))

However, the object fc itself prints as:

Empirical CDF:    4825 unique values with summary
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.01227 0.03857 0.05602 0.10730 0.15250 0.68020

I would really appreciate any help. Thank you very much in advance. I think I need to figure out how to save fc when building the package. The stats package version is 3.0.1 (I thought that could be the reason, but I am not sure).


Solution

  • Are you sure the R versions are compatible here? The internals of ecdf() and approxfun() objects (yes, they are functions) were changed relatively recently and now go via the .approxfun(..) wrapper, which is hidden in the "stats" namespace.

    But I rather assume the problem arises because you use data(.) to access such objects in your package: when the package is built, R may resave the data and end up losing the important property that environment(fc) must have the "stats" namespace as its parent environment.

    > set.seed(7); Fn <- ecdf(rnorm(12))
    > save(Fn, file="/tmp/Fn.rda")
    > rm(Fn)
    > load(file="/tmp/Fn.rda")
    > Fn
    Empirical CDF
    Call: ecdf(rnorm(12))
      x[1:12] = -1.1968, -0.97067, -0.94728,  ..., 2.2872, 2.7168
    > plot(Fn)
    > Fn(1)
    [1] 0.75
    > q()
    
    ... restart R
    
    > (load(file="/tmp/Fn.rda"))
    [1] "Fn"
    > Fn(1)
    [1] 0.75
    
    > parent.env(environment(Fn))
    <environment: namespace:stats>
    

    So everything works with regular save()ing and load()ing of ecdf objects.
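
    To check whether a particular ecdf object still has that property, something like the following sketch might help. The helper name reaches_stats is made up for illustration, and the "broken" copy is built by hand to imitate an object whose enclosure has been lost; it is not necessarily what package building produces.

    ```r
    # Hypothetical helper: walk the enclosure chain of a function and report
    # whether the "stats" namespace is reached.
    reaches_stats <- function(f) {
      env <- environment(f)
      while (!identical(env, emptyenv())) {
        if (identical(env, asNamespace("stats"))) return(TRUE)
        env <- parent.env(env)
      }
      FALSE
    }

    set.seed(7)
    Fn <- ecdf(rnorm(12))
    reaches_stats(Fn)       # a freshly created ecdf is enclosed by namespace:stats

    # Imitate a damaged object: same variables, but the enclosure now hangs
    # off the global environment, where the unexported .approxfun is not visible.
    broken <- Fn
    environment(broken) <- list2env(as.list(environment(Fn)), parent = globalenv())
    reaches_stats(broken)   # calling broken(0) should then fail to find .approxfun
    ```

    If the check returns FALSE for the object obtained through data(), that would point to the lost enclosure rather than to a version mismatch.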

    Proposed solution: Don't use data() to store objects that are used inside your functions. data() is not intended for this at all (but rather for providing illuminating data sets).

    Rather, put the file somewhere like /inst/internal/ecdf_lst.rda and read it inside your function with something like load(system.file("internal/ecdf_lst.rda", package="<pkg>"))
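
    A minimal sketch of that idea, under some assumptions: the helper name load_ecdf_list and the object name ecdf_lst are placeholders, and the demonstration uses a temporary file in place of an installed package so that it runs on its own. The file is loaded into a private environment so nothing is written into the caller's workspace.

    ```r
    # Hypothetical helper: load a saved list of ecdf objects from an .rda file
    # into a private environment and return it.
    load_ecdf_list <- function(path, name = "ecdf_lst") {
      if (!nzchar(path) || !file.exists(path)) {
        stop("cannot find saved ecdf list at: ", path)
      }
      env <- new.env()
      load(path, envir = env)
      get(name, envir = env, inherits = FALSE)
    }

    # Stand-alone demonstration with a temporary file.
    set.seed(7)
    ecdf_lst <- list(p1 = ecdf(rnorm(12)))
    tmp <- tempfile(fileext = ".rda")
    save(ecdf_lst, file = tmp)
    rm(ecdf_lst)

    lst <- load_ecdf_list(tmp)
    fc <- lst[["p1"]]
    fc(1)   # evaluates the restored ecdf at 1
    ```

    Inside the package, the path would come from system.file() instead, e.g. load_ecdf_list(system.file("internal", "ecdf_lst.rda", package = "<pkg>")); system.file() returns "" when the file is missing, which the helper turns into an error.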