I need to overwrite an element of an Rcpp::List object passed as a parameter to an Rcpp function. My concern is memory safety. By reassigning a non-empty element of the list, am I effectively rewiring a pointer to the original content without ever deallocating the memory that stores it? If so, how does one solve this?
I am aware that I can easily modify an Rcpp object (e.g. Rcpp::NumericVector) that is an element of an Rcpp::List, since Rcpp::NumericVector makes a shallow copy. This does not satisfy my requirement, however, which is to replace the element entirely with something else.
Below, I include a C++ code snippet which shows the scenario to which I am referring.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
void replaceListElement(List l)
{
    std::vector<int> v;
    v.push_back(4);
    v.push_back(5);
    v.push_back(6);
    l["a"] = v;
}
/*** R
l <- list()
l$a <- c(1,2,3)
replaceListElement(l)
print(l)
*/
When sourced via Rcpp in RStudio, the print(l) command outputs the following:
$a
[1] 4 5 6
which is the desired result, so my question pertains only to memory safety.
An Rcpp::List is a Vector<VECSXP>, i.e. a vector of pointers to other vectors. If you assign a new vector to some element in this list, you are indeed just changing a pointer without freeing the memory that the pointer used to point to. However, R still knows about this memory and frees it through its garbage collector. We can see this in action with a simple experiment, in which I use your C++ code with a slight change in the R code:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
void replaceListElement(List l)
{
    std::vector<int> v;
    v.push_back(4);
    v.push_back(5);
    v.push_back(6);
    l["a"] = v;
}
/*** R
l <- list()
l$a <- runif(1e7)
replaceListElement(l)
print(l)
gc() # optional
*/
Here a larger vector is used to make the effect more prominent. If I now use R -d valgrind -e 'Rcpp::sourceCpp("<filename>")' I get the following result with the gc() call:
==13827==
==13827== HEAP SUMMARY:
==13827== in use at exit: 48,125,775 bytes in 9,425 blocks
==13827== total heap usage: 34,139 allocs, 24,714 frees, 173,261,724 bytes allocated
==13827==
==13827== LEAK SUMMARY:
==13827== definitely lost: 0 bytes in 0 blocks
==13827== indirectly lost: 0 bytes in 0 blocks
==13827== possibly lost: 0 bytes in 0 blocks
==13827== still reachable: 48,125,775 bytes in 9,425 blocks
==13827== of which reachable via heuristic:
==13827== newarray : 4,264 bytes in 1 blocks
==13827== suppressed: 0 bytes in 0 blocks
==13827== Rerun with --leak-check=full to see details of leaked memory
==13827==
==13827== For counts of detected and suppressed errors, rerun with: -v
==13827== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
And without the gc() call:
==13761==
==13761== HEAP SUMMARY:
==13761== in use at exit: 132,713,314 bytes in 10,009 blocks
==13761== total heap usage: 34,086 allocs, 24,077 frees, 173,212,886 bytes allocated
==13761==
==13761== LEAK SUMMARY:
==13761== definitely lost: 0 bytes in 0 blocks
==13761== indirectly lost: 0 bytes in 0 blocks
==13761== possibly lost: 0 bytes in 0 blocks
==13761== still reachable: 132,713,314 bytes in 10,009 blocks
==13761== of which reachable via heuristic:
==13761== newarray : 4,264 bytes in 1 blocks
==13761== suppressed: 0 bytes in 0 blocks
==13761== Rerun with --leak-check=full to see details of leaked memory
==13761==
==13761== For counts of detected and suppressed errors, rerun with: -v
==13761== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
So in both cases valgrind does not detect any memory leak. The amount of still-reachable memory differs by 84,587,539 bytes (132,713,314 − 48,125,775), i.e. about the 8 × 10^7 bytes occupied by the 10^7 doubles of the original vector in l$a. This demonstrates that R indeed knows about the original vector and frees it when told to do so. The same would happen whenever R decides by itself to run the garbage collector.
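To make the mechanism more concrete, here is a toy mark-and-sweep heap in plain C++. It is only a sketch under stated assumptions: ToyHeap, Obj and overwrite_then_collect are hypothetical names that are not part of R or Rcpp, and R's real collector is considerably more elaborate. The point it illustrates is that the heap keeps track of every object it has allocated, so overwriting a slot in a list does not lose the old object. It merely becomes unreachable and is freed at the next collection.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for an R vector (hypothetical type, not part of R or Rcpp).
struct Obj {
    std::vector<double> payload;
    bool marked = false;
};

// Toy heap: it owns every allocation it hands out, so nothing is ever
// "lost" just because a user-visible pointer was overwritten.
class ToyHeap {
public:
    // Allocate an object with n zero-initialised elements.
    Obj* alloc(std::size_t n) {
        objects_.push_back(std::make_unique<Obj>());
        objects_.back()->payload.assign(n, 0.0);
        return objects_.back().get();
    }

    // Mark every object referenced by the root list, then free the rest.
    // Returns the number of objects that were freed.
    std::size_t collect(const std::vector<Obj*>& roots) {
        for (auto& o : objects_) o->marked = false;
        for (Obj* r : roots) {
            if (r != nullptr) r->marked = true;
        }
        const std::size_t before = objects_.size();
        objects_.erase(
            std::remove_if(objects_.begin(), objects_.end(),
                           [](const std::unique_ptr<Obj>& o) { return !o->marked; }),
            objects_.end());
        return before - objects_.size();
    }

    // Number of objects currently held by the heap.
    std::size_t live() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<Obj>> objects_;
};

// Mirrors the experiment: overwrite a list slot, then run a collection.
std::size_t overwrite_then_collect() {
    ToyHeap heap;
    std::vector<Obj*> list{heap.alloc(1000)};  // plays the role of l$a <- runif(...)
    list[0] = heap.alloc(3);                   // plays the role of l["a"] = v
    assert(heap.live() == 2);                  // old object is still held by the heap
    const std::size_t freed = heap.collect(list);  // plays the role of gc()
    assert(heap.live() == 1);                  // only the new element survives
    return freed;
}
```

Calling overwrite_then_collect() returns 1 in this toy model: the overwritten object is the only one reclaimed. Before the collection it is still held by the heap, which corresponds to the larger "still reachable" figure in the run without gc().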