Scope & memory issues in Scala


I have a very large List of numbers that undergoes a lot of mathematical manipulation. I only care about the final result. To simulate this behavior, see my example code below:

object X {
  def main(args: Array[String]): Unit = {
    val N = 10000000
    val x = List(1 to N).flatten
    println(x.slice(0, 10))
    Thread.sleep(5000)
    val y = x.map(_ * 5)
    println(y.slice(0, 10))
    Thread.sleep(5000)
    val z = y.map(_ + 4)
    println(z.slice(0, 10))
    Thread.sleep(5000)
  }
}

So x is a very large list. I care only about the result z. To obtain z, I first have to mathematically manipulate x to get y, and then manipulate y to get z. (I cannot go from x to z in one step, because the manipulations are quite complicated. This is just an example.)

When I run this example, I run out of memory, presumably because x, y and z are all in scope and all occupy memory.

So I try the following:

def main(args: Array[String]): Unit = {
    val N = 10000000
    val z = {
        val y = {
            val x = List(1 to N).flatten
            println(x.slice(0, 10))
            Thread.sleep(5000)
            x
        }.map(_ * 5)

        println(y.slice(0, 10))
        Thread.sleep(5000)
        y
    }.map(_ + 4)
    println(z.slice(0, 10))
    Thread.sleep(5000)
}

Now only z is in scope, so presumably x and y are created and then garbage collected once they go out of scope. But this isn't what happens. Instead, I again run out of memory!

(Note: I am using java -Xincgc, but it doesn't help.)

Question: When I have adequate memory for only one large list, can I somehow manipulate it using only vals (i.e. no mutable vars or ListBuffers), maybe using scoping to force GC? If so, how? Thanks.


Solution

  • Have you tried something like this?

    val N = 10000000
    val x = List(1 to N).flatten.view // get a view
    val y = x.map(_ * 5)
    val z = y.map(_ + 4)
    println(z.force.slice(0, 10))
    

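    This should avoid building a full intermediate list for y: the view applies both map steps lazily, and z is only materialized when force is called. Note that x itself is still built as a full list before the view is taken.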
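
A minimal sketch of the same idea, using only vals. It assumes the source numbers can be generated from a range rather than from an existing List, so that only the final list is ever materialized. The names `LazyPipeline`, `step1`, `step2` and `finalResult` are hypothetical stand-ins for the real, more complicated manipulations, and `toList` is used in place of `force` because `force` is deprecated on views in Scala 2.13.

```scala
object LazyPipeline {
  // Hypothetical stand-ins for the two "complicated" steps in the question.
  val step1: Int => Int = _ * 5
  val step2: Int => Int = _ + 4

  // The view applies both steps element by element, so no intermediate
  // collection for the first step is allocated. Only toList builds a list.
  def finalResult(n: Int): List[Int] =
    (1 to n).view.map(step1).map(step2).toList

  def main(args: Array[String]): Unit = {
    // Smaller N than in the question, to keep the demo light on heap.
    val z = finalResult(1000000)
    println(z.take(10))
  }
}
```

If only the first few elements are needed for inspection, calling `take(10)` on the view before `toList` avoids building even the final list.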