Search code examples
scala

Generating all possible combinations from a List[List[Int]] in Scala


Given the following list:

List(List(1,2,3), List(4,5))

I would like to generate all the possible combinations. Using yield, it can be done as follows:

scala> for (x <- l.head; y <- l.last) yield (x,y)
res17: List[(Int, Int)] = List((1,4), (1,5), (2,4), (2,5), (3,4), (3,5))

But the problem I have is that the List[List[Int]] is not fixed; it can grow and shrink in size, so I never know how many for loops I will need in advance. What I would like is to be able to pass that list into a function which will dynamically generate the combinations regardless of the number of lists I have, so:

def generator (x : List[List[Int]) : List[List[Int]]

Is there a built-in library function that can do this. If not how do I go about doing this. Any pointers and hints would be great.

UPDATE:

The answer by @DNA blows the heap with the following (not so big) nested List structure:

List(

    List(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300), 
    List(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300), 
    List(0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300), 
    List(0, 50, 100, 150, 200, 250, 300), 
    List(0, 100, 200, 300), 
    List(0, 200), 
    List(0)

    )

Calling the generator2 function as follows:

generator2(
      List(
        List(0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 205, 210, 215, 220, 225, 230, 235, 240, 245, 250, 255, 260, 265, 270, 275, 280, 285, 290, 295, 300),
        List(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300),
        List(0, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300),
        List(0, 50, 100, 150, 200, 250, 300),
        List(0, 100, 200, 300),
        List(0, 200),
        List(0)
      )
    )

Is there a way to generate the cartesian product without blowing the heap?

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at scala.LowPriorityImplicits.wrapRefArray(LowPriorityImplicits.scala:73)
    at recfun.Main$.recfun$Main$$generator$1(Main.scala:82)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
    at recfun.Main$.recfun$Main$$generator$1(Main.scala:83)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
    at recfun.Main$.recfun$Main$$generator$1(Main.scala:83)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
    at recfun.Main$.recfun$Main$$generator$1(Main.scala:83)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at recfun.Main$$anonfun$recfun$Main$$generator$1$1.apply(Main.scala:83)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)

Solution

  • Here's a recursive solution:

      def generator(x: List[List[Int]]): List[List[Int]] = x match {
        case Nil    => List(Nil)
        case h :: _ => h.flatMap(i => generator(x.tail).map(i :: _))
      }
    

    which produces:

    val a = List(List(1, 2, 3), List(4, 5))       
    val b = List(List(1, 2, 3), List(4, 5), List(6, 7))
    
    generator(a)    //> List(List(1, 4), List(1, 5), List(2, 4), 
                    //| List(2, 5), List(3, 4), List(3, 5))
    generator(b)    //> List(List(1, 4, 6), List(1, 4, 7), List(1, 5, 6), 
                    //| List(1, 5, 7), List(2, 4, 6), List(2, 4, 7),
                    //| List(2, 5, 6), List(2, 5, 7), Listt(3, 4, 6), 
                    //| List(3, 4, 7), List(3, 5, 6), List(3, 5, 7))
    

    Update: the second case can also be written as a for comprehension, which may be a little clearer:

    def generator2(x: List[List[Int]]): List[List[Int]] = x match {
      case Nil    => List(Nil)
      case h :: t => for (j <- generator2(t); i <- h) yield i :: j
    }
    

    Update 2: for larger datasets, if you run out of memory, you can use Streams instead (if it makes sense to process the results incrementally). For example:

    def generator(x: Stream[Stream[Int]]): Stream[Stream[Int]] = 
      if (x.isEmpty) Stream(Stream.empty) 
      else x.head.flatMap(i => generator(x.tail).map(i #:: _))
    
    // NB pass in the data as Stream of Streams, not List of Lists
    generator(input).take(3).foreach(x => println(x.toList))
    
    >List(0, 0, 0, 0, 0, 0, 0)
    >List(0, 0, 0, 0, 0, 200, 0)
    >List(0, 0, 0, 0, 100, 0, 0)