Is replacing element of Rcpp::List inside an Rcpp function memory-safe?


I need to overwrite an element of an Rcpp::List object passed as a parameter to an Rcpp function. My concern is memory safety: by reassigning a non-empty element of the list, am I merely rewiring a pointer away from the original content without ever deallocating the memory that stores it? If so, how does one solve this?

I am aware that I can easily modify an Rcpp object (e.g. an Rcpp::NumericVector) that is an element of an Rcpp::List, since Rcpp::NumericVector makes a shallow copy. This does not satisfy my requirement, however, which is to replace the element entirely with something else.

Below, I include a C++ code snippet which shows the scenario to which I am referring.

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void replaceListElement(List l)
{
    std::vector<int> v;
    v.push_back(4);
    v.push_back(5);
    v.push_back(6);
    l["a"] = v;
}

/*** R
l <- list()
l$a <- c(1,2,3)
replaceListElement(l)
print(l)
*/

When sourced via Rcpp in RStudio, the print(l) command outputs the following:

$a
[1] 4 5 6

which is the desired result, so my question pertains only to memory safety.


Solution

  • An Rcpp::List is a Vector<VECSXP>, i.e. a vector of pointers to other R objects (here, other vectors). If you assign a new vector to an element of this list, you are indeed just changing a pointer without freeing the memory that the pointer used to point to. However, the old vector is still an object managed by R, so R knows about this memory and frees it through its garbage collector once nothing references it any more. We can see this in action with a simple experiment, in which I use your C++ code with a slight change in the R code:

    #include <Rcpp.h>
    using namespace Rcpp;
    
    // [[Rcpp::export]]
    void replaceListElement(List l)
    {
      std::vector<int> v;
      v.push_back(4);
      v.push_back(5);
      v.push_back(6);
      l["a"] = v;
    }
    
    /*** R
    l <- list()
    l$a <- runif(1e7)
    replaceListElement(l)
    print(l)
    gc() # optional
    */
    

    Here a larger vector is used to make the effect more prominent. Running this with R -d valgrind -e 'Rcpp::sourceCpp("<filename>")' gives the following result with the gc() call:

    ==13827==
    ==13827== HEAP SUMMARY:
    ==13827==     in use at exit: 48,125,775 bytes in 9,425 blocks
    ==13827==   total heap usage: 34,139 allocs, 24,714 frees, 173,261,724 bytes allocated
    ==13827==
    ==13827== LEAK SUMMARY:
    ==13827==    definitely lost: 0 bytes in 0 blocks
    ==13827==    indirectly lost: 0 bytes in 0 blocks
    ==13827==      possibly lost: 0 bytes in 0 blocks
    ==13827==    still reachable: 48,125,775 bytes in 9,425 blocks
    ==13827==                       of which reachable via heuristic:
    ==13827==                         newarray           : 4,264 bytes in 1 blocks
    ==13827==         suppressed: 0 bytes in 0 blocks
    ==13827== Rerun with --leak-check=full to see details of leaked memory
    ==13827==
    ==13827== For counts of detected and suppressed errors, rerun with: -v
    ==13827== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    

    And without the gc() call:

    ==13761==
    ==13761== HEAP SUMMARY:
    ==13761==     in use at exit: 132,713,314 bytes in 10,009 blocks
    ==13761==   total heap usage: 34,086 allocs, 24,077 frees, 173,212,886 bytes allocated
    ==13761==
    ==13761== LEAK SUMMARY:
    ==13761==    definitely lost: 0 bytes in 0 blocks
    ==13761==    indirectly lost: 0 bytes in 0 blocks
    ==13761==      possibly lost: 0 bytes in 0 blocks
    ==13761==    still reachable: 132,713,314 bytes in 10,009 blocks
    ==13761==                       of which reachable via heuristic:
    ==13761==                         newarray           : 4,264 bytes in 1 blocks
    ==13761==         suppressed: 0 bytes in 0 blocks
    ==13761== Rerun with --leak-check=full to see details of leaked memory
    ==13761==
    ==13761== For counts of detected and suppressed errors, rerun with: -v
    ==13761== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    

    So in both cases valgrind does not detect any memory leak. The amount of still reachable memory differs by about 8x10^7 bytes (132,713,314 - 48,125,775 = 84,587,539), which matches the size of the original vector in l$a: 10^7 doubles at 8 bytes each. This demonstrates that R still knows about the original vector and frees it when told to collect garbage; the same would happen whenever R decides by itself to run the garbage collector.
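
To make the mechanism more tangible, here is a minimal toy model in plain C++ (no Rcpp). It is only a sketch and not R's actual collector: the names Obj, ToyHeap and replaceElementDemo are hypothetical and exist only for this illustration. The idea it shows is the one described above: the heap keeps track of every object it allocated, so overwriting a list slot never loses the old object, and a later collection pass frees whatever is no longer reachable from the list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for an R object (hypothetical, for illustration only).
struct Obj {
    std::vector<double> payload;  // data held by this object
    std::vector<Obj*> children;   // non-owning pointers, like the slots of a list
    bool marked = false;          // mark bit used during collection
};

// Toy heap: it owns every allocated object, so an object that no list slot
// points to any more is still known to the heap and can be freed later.
class ToyHeap {
public:
    // Allocate an object with n payload elements and return a non-owning pointer.
    Obj* allocate(std::size_t n) {
        objects_.push_back(std::make_unique<Obj>());
        objects_.back()->payload.assign(n, 0.0);
        return objects_.back().get();
    }

    // Mark everything reachable from root, then free all unmarked objects.
    // Returns the number of objects freed.
    std::size_t collect(Obj* root) {
        assert(root != nullptr);
        for (auto& o : objects_) o->marked = false;
        mark(root);
        const std::size_t before = objects_.size();
        objects_.erase(
            std::remove_if(objects_.begin(), objects_.end(),
                           [](const std::unique_ptr<Obj>& o) { return !o->marked; }),
            objects_.end());
        return before - objects_.size();
    }

    std::size_t liveObjects() const { return objects_.size(); }

private:
    static void mark(Obj* o) {
        if (o == nullptr || o->marked) return;
        o->marked = true;
        for (Obj* c : o->children) mark(c);
    }

    std::vector<std::unique_ptr<Obj>> objects_;
};

struct DemoResult {
    std::size_t liveBefore;  // objects known to the heap before collecting
    std::size_t freed;       // objects freed by the collection pass
    std::size_t liveAfter;   // objects known to the heap afterwards
};

// Mirrors the experiment: a list whose single element is replaced.
DemoResult replaceElementDemo() {
    ToyHeap heap;
    Obj* list = heap.allocate(0);
    list->children.push_back(heap.allocate(1000));  // plays the role of l$a <- runif(...)
    list->children[0] = heap.allocate(3);           // plays the role of l["a"] = v

    DemoResult r;
    r.liveBefore = heap.liveObjects();  // list, old element, new element
    r.freed = heap.collect(list);       // only the old element is unreachable
    r.liveAfter = heap.liveObjects();   // list and new element remain
    return r;
}
```

Calling replaceElementDemo() in this toy model reports three live objects before the collection, one object freed, and two live objects afterwards. That is the same pattern as in the valgrind runs: the replaced vector stays allocated until a collection happens, but it is never lost.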