Search code examples
crrcppr-s4

S4 object with a pointer to a C struct


I have a third-party C library I am using to write an R extension. I am required to create a few structs defined in the library (and initialize them) I need to maintain them as part of an S4 object (think of these structs as defining to state of a computation, to destroy them would be to destroy all remaining computation and the results of all that has been already computed). I am thinking of creating a S4 object to hold pointers these structs as void* pointers but it is not at all clear how to do so, what would be the type of the slot?


Solution

  • As pointed out by @hrbrmstr, you can use the externalptr type to keep such objects "alive", which is touched on in this section of Writing R Extensions, although I don't see any reason why you will need to store anything as void*. If you don't have any issue with using a little C++, the Rcpp class XPtr can eliminate a fair amount of the boilerplate involved with managing EXTPTRSXPs. As an example, assume the following simplified example represents your third party library's API:

    #include <Rcpp.h>
    #include <stdlib.h>
    
    typedef struct {
        unsigned int count;
        double total;
    } CStruct;
    
    CStruct* init_CStruct() {
        return (CStruct*)::malloc(sizeof(CStruct));
    }
    
    void free_CStruct(CStruct* ptr) {
        ::free(ptr);
        ::printf("free_CStruct called.\n");
    }
    
    typedef Rcpp::XPtr<CStruct, Rcpp::PreserveStorage, free_CStruct> xptr_t;
    

    When working with pointers created via new it is generally sufficient to use Rcpp::XPtr<SomeClass>, because the default finalizer simply calls delete on the held object. However, since you are dealing with a C API, we have to supply the (default) template parameter Rcpp::PreserveStorage, and more importantly, the appropriate finalizer (free_CStruct in this example) so that the XPtr does not call delete on memory allocated via malloc, etc., when the corresponding R object is garbage collected.

    Continuing with the example, assume you write the following functions to interact with your CStruct:

    // [[Rcpp::export]]
    xptr_t MakeCStruct() {
        CStruct* ptr = init_CStruct();
        ptr->count = 0;
        ptr->total = 0;
    
        return xptr_t(ptr, true);
    }
    
    // [[Rcpp::export]]
    void UpdateCStruct(xptr_t ptr, SEXP x) {
        if (TYPEOF(x) == REALSXP) {
            R_xlen_t i = 0, sz = XLENGTH(x);
            for ( ; i < sz; i++) {
                if (!ISNA(REAL(x)[i])) {
                    ptr->count++;
                    ptr->total += REAL(x)[i];
                }
            }
            return;
        }
    
        if (TYPEOF(x) == INTSXP) {
            R_xlen_t i = 0, sz = XLENGTH(x);
            for ( ; i < sz; i++) {
                if (!ISNA(INTEGER(x)[i])) {
                    ptr->count++;
                    ptr->total += INTEGER(x)[i];
                }
            }
            return;
        }
    
        Rf_warning("Invalid SEXPTYPE.\n");
    }
    
    // [[Rcpp::export]]
    void SummarizeCStruct(xptr_t ptr) {
        ::printf(
            "count: %d\ntotal: %f\naverage: %f\n",
            ptr->count, ptr->total,
            ptr->count > 0 ? ptr->total / ptr->count : 0
        );
    }
    
    // [[Rcpp::export]]
    int GetCStructCount(xptr_t ptr) {
        return ptr->count;
    }
    
    // [[Rcpp::export]]
    double GetCStructTotal(xptr_t ptr) {
        return ptr->total;
    }
    
    // [[Rcpp::export]]
    void ResetCStruct(xptr_t ptr) {
        ptr->count = 0;
        ptr->total = 0.0;
    }
    

    At this point, you have done enough to start handling CStructs from R:

    • ptr <- MakeCStruct() will initialize a CStruct and store it as an externalptr in R
    • UpdateCStruct(ptr, x) will modify the data stored in the CStruct, SummarizeCStruct(ptr) will print a summary, etc.
    • rm(ptr); gc() will remove the ptr object and force the garbage collector to run, thus calling free_CStruct(ptr) and destroying the object on the C side of things as well

    You mentioned the use of S4 classes, which is one option for containing all of these functions in a single place. Here's one possibility:

    setClass(
        "CStruct",
        slots = c(
            ptr = "externalptr",
            update = "function",
            summarize = "function",
            get_count = "function",
            get_total = "function",
            reset = "function"
        )
    )
    
    setMethod(
        "initialize",
        "CStruct",
        function(.Object) {
            .Object@ptr <- MakeCStruct()
            .Object@update <- function(x) {
                UpdateCStruct(.Object@ptr, x)
            }
            .Object@summarize <- function() {
                SummarizeCStruct(.Object@ptr)
            }
            .Object@get_count <- function() {
                GetCStructCount(.Object@ptr)
            }
            .Object@get_total <- function() {
                GetCStructTotal(.Object@ptr)
            }
            .Object@reset <- function() {
                ResetCStruct(.Object@ptr)
            }
            .Object
        }
    ) 
    

    Then, we can work with the CStructs like this:

    ptr <- new("CStruct")
    ptr@summarize()
    # count: 0
    # total: 0.000000
    # average: 0.000000
    
    set.seed(123)
    ptr@update(rnorm(100))
    ptr@summarize()
    # count: 100
    # total: 9.040591
    # average: 0.090406
    
    ptr@update(rnorm(100))
    ptr@summarize()
    # count: 200
    # total: -1.714089
    # average: -0.008570
    
    ptr@reset()
    ptr@summarize()
    # count: 0
    # total: 0.000000
    # average: 0.000000
    
    rm(ptr); gc()
    # free_CStruct called.
    #          used (Mb) gc trigger (Mb) max used (Mb)
    # Ncells 484713 25.9     940480 50.3   601634 32.2
    # Vcells 934299  7.2    1650153 12.6  1308457 10.0
    

    Of course, another option is to use Rcpp Modules, which more or less take care of the class definition boilerplate on the R side (using reference classes rather than S4 classes, however).