I encountered some unauthorized strangeness working with Scala's SortedMap[A,B]. If I declare the reference to SortedMap[A,B] "a" to be of type Map[A,B], then map operations on "a" will produce a non-sorted map implementation.
Example:
import scala.collection.immutable._
object Test extends App {
val a: Map[String, String] = SortedMap[String, String]("a" -> "s", "b" -> "t", "c" -> "u", "d" -> "v", "e" -> "w", "f" -> "x")
println(a.getClass+": "+a)
val b = a map {x => x} // identity
println(b.getClass+": "+b)
}
The output of the above is:
class scala.collection.immutable.TreeMap: Map(a -> s, b -> t, c -> u, d -> v, e -> w, f -> x)
class scala.collection.immutable.HashMap$HashTrieMap: Map(e -> w, f -> x, a -> s, b -> t, c -> u, d -> v)
The order of key/value pairs before and after the identity transformation is not the same.
The strange thing is that removing the type declaration from "a" makes this issue go away. That's fine in a toy example, but makes SortedMap[A,B] unusable for passing to methods that expect Map[A,B] parameters.
In general, I would expect higher order functions such as "map" and "filter" to not change the fundamental properties of the collections they are applied to.
Does anyone know why "map" is behaving like this?
The map
method, like most of the collection methods, isn't defined specifically for SortedMap
. It is defined on a higher-level class (TraversableLike) and uses a "builder" to turn the mapped result into the correct return type.
So how does it decide what the "correct" return type is? Well, it tries to give you back the return type that it started out as. When you tell Scala that you have a Map[String,String]
and ask it to map
, then the builder has to figure out how to "build" the type for returning. Since you told Scala that the input was a Map[String,String]
, the builder decides to build a Map[String,String]
for you. The builder doesn't know that you wanted a SortedMap
, so it doesn't give you one.
The reason it works when you leave off the the Map[String,String]
type annotation is that Scala infers that the type of a
is SortedMap[String,String]
. Thus, when you call map
, you are calling it on a SortedMap
, and the builder knows to construct a SortedMap
for returning.
As far as your assertion that methods shouldn't change "fundamental properties", I think you're looking at it from the wrong angle. The methods will always give you back an object that conforms to the type that you specify. It's the type that defines the behavior of the builder, not the underlying implementation. When you think about like that, it's the type that forms the contract for how methods should behave.
Why is this the preferred behavior? Let's look at a concrete example. Say we have a SortedMap[Int,String]
val sortedMap = SortedMap[Int, String](1 -> "s", 2 -> "t", 3 -> "u", 4 -> "v")
If I were to map
over it with a function that modifies the keys, I run the risk of losing elements when their keys clash:
scala> sortedMap.map { case (k, v) => (k / 2, v) }
res3: SortedMap[Int,String] = Map(0 -> s, 1 -> u, 2 -> v)
But hey, that's fine. It's a Map
after all, and I know it's a Map
, so I should expect that behavior.
Now let's say we have a function that accepts an Iterable
of pairs:
def f(iterable: Iterable[(Int, String)]) =
iterable.map { case (k, v) => (k / 2, v) }
Since this function has nothing to do with Map
s, it would be very surprising if the result of this function ever had fewer elements than the input. After all, map
on a Iterable
should produce the mapped version of each element. But a Map
is an Iterable
of pairs, so we can pass it into this function. So what happens in Scala when we do?
scala> f(sortedMap)
res4: Iterable[(Int, String)] = List((0,s), (1,t), (1,u), (2,v))
Look at that! No elements lost! In other words, Scala won't surprise us by violating our expectations about how map
on an Iterable
should work. If the builder instead tried to produce a SortedMap
based on the fact that the input was a SortedMap
, then our function f
would have surprising results, and this would be bad.
So the moral of the story is: Use the types to tell the collections framework how to deal with your data. If you want your code to be able to expect that a map is sorted, then you should type it as SortedMap
.