c++ r spatstat

Convert spatstat functions to C++ to circumvent memory-limitation


I am using spatstat to estimate the risk of pest introduction and spread from roads, highways, and other roadways. However, I believe I am running into memory limitations: my data are at a continental scale and my computer has only 16 GB of memory. The error message I receive when running spatstat's as.owin() and density.psp() functions is:

Error: cannot allocate vector of size X.X. Gb

Some colleagues of mine have suggested I might be able to reduce the memory burden by converting the spatstat functions as.owin() and density.psp() to execute via C++ with the Rcpp package. This technique is well outside my comfort zone, and I was hoping to get a sense from StackOverflow of whether it's even feasible before I dedicate many hours to it.

Specifically, my questions are:

  1. Has anyone converted spatstat functions to C++?
  2. How have other spatstat users worked around memory-limitation issues?

Any help and guidance would be greatly appreciated.

Many thanks,

Josh


Solution

  • Firstly, I strongly agree with the poster who said that the quick fix is not to edit the code but to throw more computing resources at the existing code. It can be fiddly to use a cloud computing service, but it would take much more time to re-implement, test, and validate completely new source code.

    But anyway:

    The first thing to check is whether the pixel images that you want to create are too big to store in R memory at all. Try just creating Z <- as.im(R, dimyx=d) where R is a rectangle containing your spatial domain and d is the dimensions (rows, columns) of the desired image. If that fails with a message about memory limits, then you're going to need a bigger boat -- I mean, computer.
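
    As a rough sketch of that check, you could estimate the raster size before allocating anything. This assumes each pixel is stored as an 8-byte double; the helper image_size_gib and the dimensions and rectangle below are hypothetical placeholders, not values from the question.

    ```r
    # Approximate size in GiB of one raster of doubles with dimensions dimyx
    # (assumption: 8 bytes per pixel, ignoring the small overhead of the im object).
    image_size_gib <- function(dimyx) {
      prod(as.numeric(dimyx)) * 8 / 1024^3
    }

    d <- c(20000, 30000)   # hypothetical (rows, columns) for a continental raster
    cat("Estimated size of one image:", round(image_size_gib(d), 2), "GiB\n")

    # If spatstat.geom is installed, try the real allocation at a small size first.
    if (requireNamespace("spatstat.geom", quietly = TRUE)) {
      library(spatstat.geom)
      R <- owin(c(0, 3000), c(0, 2000))   # hypothetical bounding rectangle
      d_small <- c(200, 300)              # scale up gradually towards d
      Z <- as.im(R, dimyx = d_small)
      print(object.size(Z), units = "Kb")
    }
    ```

    Bear in mind that a computation usually holds several copies of an image at once, so a single image that only just fits in memory is unlikely to be enough.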

    The function density.psp has options method="FFT" (the default) and method="C". Have you tried both of these? The FFT method uses more memory because it does the whole calculation in one enormous Fourier transform (after expanding the domain to several times its original size). The C method is a loop over all pixels and all segments; it is slower, but requires relatively little memory, apart from the storage for the output raster data. If method="C" fails because of insufficient memory, this would again suggest that the raster images you're trying to create are too large to store in R memory.
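
    A minimal sketch of trying both methods on a small synthetic example is shown here. The segment pattern, the bandwidth sigma, and the resolution dimyx are made-up values for illustration; substitute your own data and scale up gradually.

    ```r
    if (requireNamespace("spatstat", quietly = TRUE)) {
      library(spatstat)
      set.seed(1)

      # Hypothetical line segment pattern standing in for the road network.
      W <- owin(c(0, 1), c(0, 1))
      roads <- psp(runif(50), runif(50), runif(50), runif(50), window = W)

      # Same bandwidth and pixel resolution, two computational methods.
      t_fft <- system.time(
        dens_fft <- density(roads, sigma = 0.05, dimyx = c(128, 128), method = "FFT")
      )
      t_c <- system.time(
        dens_c <- density(roads, sigma = 0.05, dimyx = c(128, 128), method = "C")
      )

      print(rbind(FFT = t_fft, C = t_c))
      # Largest pixelwise difference between the two estimates.
      print(max(abs(as.matrix(dens_fft) - as.matrix(dens_c))))
    }
    ```

    If method="C" succeeds at a resolution where method="FFT" runs out of memory, the Fourier transform workspace is the likely bottleneck rather than the output raster itself.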

    The function as.owin is generic, with 28 methods. Which method is giving you trouble? What data are you converting to an owin?
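
    To find out which method is being dispatched, something along these lines may help. The object obj is a hypothetical stand-in for whatever you are passing to as.owin.

    ```r
    if (requireNamespace("spatstat.geom", quietly = TRUE)) {
      library(spatstat.geom)

      # List the available methods for the generic.
      print(methods("as.owin"))

      # Hypothetical input: a single line segment in the unit square.
      obj <- psp(0, 0, 1, 1, window = owin())

      # The class of the object determines which method runs.
      print(class(obj))
      m <- getS3method("as.owin", class(obj)[1])
      print(is.function(m))
    }
    ```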

    spatstat is already written in a mix of R, C and C++. We are constantly looking for ways to accelerate the code and reduce the memory demand. If you have identified a particular case where the code is slow, we would like to know the details. If you do spot a way to fix or accelerate some of the code, please share it.