Tags: scala, scala-collections, for-comprehension

Why does a for-loop with yield accumulate into a Map instead of a List?


I have the following code:

val dummy = Map(1 -> Map(2 -> 3.0,
                         4 -> 5.0),
                6 -> Map(7 -> 8.0))

val thisIsList = for (x <- dummy; y <- x._2.keys) yield s"(${x._1}, ${y})"
println(thisIsList)  // List((1, 2), (1, 4), (6, 7))

val thisIsMap = for (x <- dummy; y <- x._2.keys) yield new Tuple2(x._1, y)
println(thisIsMap)   // Map(1 -> 4, 6 -> 7) - this is not what I want    

I would expect the second statement to produce a list of tuples, but instead it returns a Map. I found an explanation of why a Map is returned in the question "scala: yield a sequence of tuples instead of map", but I'm still struggling to find an elegant way to return a list of tuples in this case.


Solution

  • This happens because the compiler translates for comprehension syntax into a series of method calls: map, flatMap, withFilter, and foreach, depending on the shape of the comprehension. This is very powerful and general because it lets the syntax work with arbitrary types that define those methods. There is more to it (the CanBuildFrom implicit in Scala 2.12 and earlier, overloaded map and flatMap signatures on Map from Scala 2.13 onward), but essentially, mapping the entries of a Map to tuples (A, B) produces a Map[A, B] rather than an Iterable[(A, B)]. Because Map keys are unique, a later pair with the same key overwrites an earlier one.
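
    To make the tuple rule concrete, here is a minimal sketch, assuming Scala 2.13 or later. The object name MapOverloadDemo and the vals pairs and strings are illustrative only. The type annotations exist to show which result type the compiler accepts in each case.

```scala
object MapOverloadDemo {
  val dummy: Map[Int, Map[Int, Double]] =
    Map(1 -> Map(2 -> 3.0, 4 -> 5.0), 6 -> Map(7 -> 8.0))

  // The function returns a pair, so the result is again a Map.
  val pairs: Map[Int, Int] =
    dummy.map { case (k, inner) => (k, inner.size) }

  // The function returns a String, so the result is a plain Iterable.
  val strings: Iterable[String] =
    dummy.map { case (k, inner) => s"$k:${inner.size}" }

  def main(args: Array[String]): Unit = {
    println(pairs)
    println(strings)
  }
}
```

    This is also why the first comprehension in the question, which yields a String, does not end up in a Map.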

    Specifically, given your original code below

    val thisIsMap = for (x <- dummy; y <- x._2.keys) yield new Tuple2(x._1, y)
    println(thisIsMap)   // Map(1 -> 4, 6 -> 7) - this is not what I want
    

    The translation looks roughly like this

    val thisIsMap = dummy.flatMap { x =>
      x._2.keys.map { y =>
        (x._1, y)
      }
    }
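
    As a quick check of that translation, the sketch below places the comprehension next to the hand-written flatMap/map form. The names DesugarDemo, viaFor, and viaFlatMap are made up for illustration; the translation the compiler actually emits may differ in detail.

```scala
object DesugarDemo {
  val dummy = Map(1 -> Map(2 -> 3.0, 4 -> 5.0), 6 -> Map(7 -> 8.0))

  // The comprehension from the question, yielding a tuple literal.
  val viaFor: Map[Int, Int] =
    for (x <- dummy; y <- x._2.keys) yield (x._1, y)

  // Hand-written equivalent: flatMap over the outer Map, map over the inner keys.
  val viaFlatMap: Map[Int, Int] =
    dummy.flatMap { x => x._2.keys.map { y => (x._1, y) } }

  def main(args: Array[String]): Unit = {
    // Both values are Maps, so the pair (1, 2) is overwritten by (1, 4).
    println(viaFor)
    println(viaFlatMap)
  }
}
```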
    


    In order to obtain a list as desired, we can write

    val thisIsMap = (for (x <- dummy; y <- x._2.keys) yield (x._1, y)).toList
    

    However, if we consider what we've learned about for comprehensions, we can write it more elegantly as

    val thisIsMap = for (x <- dummy.toList; y <- x._2.keys) yield (x._1, y)
    

    In the above, we have leveraged the very behavior that confounded the original code: a for comprehension whose first generator is a List produces a List.

    However, note the difference between converting the source into a List as opposed to converting the resulting map into a List after the comprehension.

    If we call toList on the source (dummy), we get List((1,2), (1,4), (6,7)). If we call it on the result, we get List((1,4), (6,7)), because the intermediate Map can hold only one value per key and (1,4) has already replaced (1,2). Choose carefully and deliberately.
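
    The two placements of toList can be compared side by side. This is a small sketch; the names ToListDemo, fromSource, and fromResult are illustrative only.

```scala
object ToListDemo {
  val dummy = Map(1 -> Map(2 -> 3.0, 4 -> 5.0), 6 -> Map(7 -> 8.0))

  // Convert the source first: the comprehension runs over a List,
  // so every generated pair is kept.
  val fromSource: List[(Int, Int)] =
    for (x <- dummy.toList; y <- x._2.keys) yield (x._1, y)

  // Convert the result afterwards: the comprehension builds a Map first,
  // so pairs sharing a key have already collapsed before toList runs.
  val fromResult: List[(Int, Int)] =
    (for (x <- dummy; y <- x._2.keys) yield (x._1, y)).toList

  def main(args: Array[String]): Unit = {
    println(fromSource)
    println(fromResult)
  }
}
```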