Suppose I've got a function Seq[String] => Seq[Int], e.g. def len(as: Seq[String]): Seq[Int] = as.map(_.length). Now I would like to apply this function to a text file, i.e. transform all the lines of the file to numbers.
I read the text file with scala.io.Source.fromFile("/tmp/xxx.txt").getLines, which returns an iterator. I can use toList or to(LazyList) to "convert" the iterator to a Seq, but then I read the whole file into memory.
So I need to write another function Iterator[String] => Iterator[Int], which is really just a copy of Seq[String] => Seq[Int]. Is that correct? What is the best way to avoid the duplicated code?
If you have an arbitrary function Seq[String] => Seq[Int], then
> I use toList or to(LazyList) to "convert" the iterator to Seq but in both cases I read the whole file in the memory.
is the best you can do, because the function can start by looking at the end of the Seq[String], or its length, etc.
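To make that concrete, here is a minimal sketch of a function with exactly the same type as len that cannot produce its first result until it has seen every input element. The name relativeLengths is hypothetical and only serves as an illustration.

```scala
// Hypothetical example: same type as len, but not streamable.
// It needs the longest line before it can emit anything.
def relativeLengths(as: Seq[String]): Seq[Int] = {
  // Traverses the whole input once just to find the maximum length.
  val longest = as.map(_.length).maxOption.getOrElse(0)
  // Only now can each element be mapped to its result.
  as.map(s => longest - s.length)
}
```

From the outside, len and relativeLengths are both just Seq[String] => Seq[Int], so a caller has no way to know that only the first one could safely be fed an iterator.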
And Scala doesn't let you look "inside" the function and figure out "it's map(something), I can just do the same map for iterators" (there are some caveats with macros, but they are not really useful here).
> So I need to write another function Iterator[String] => Iterator[Int], which is actually a copied version of Seq[String] => Seq[Int]. Is it correct ? What is the best way to avoid the duplicated code?
If you control the definition of the function, you can use higher-kinded types to define a function which works for both cases. E.g. in Scala 2.13:
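import scala.collection.IterableOnceOps

// C is the collection type constructor (Seq, Iterator, ...); map keeps the same C.
def len[C[A] <: IterableOnceOps[A, C, C[A]]](as: C[String]): C[Int] = as.map(_.length)
val x: Seq[Int] = len(Seq("a", "b"))           // strict: a Seq in, a Seq out
val y: Iterator[Int] = len(Iterator("a", "b")) // lazy: nothing is computed until y is consumed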
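Applied to the file from the question, this could look roughly like the sketch below. It assumes Scala 2.13 and that the result is consumed while the file is still open. The helper totalLength is hypothetical and just one way of consuming the iterator.

```scala
import java.nio.file.Path
import scala.collection.IterableOnceOps
import scala.io.Source
import scala.util.Using

def len[C[A] <: IterableOnceOps[A, C, C[A]]](as: C[String]): C[Int] = as.map(_.length)

// Hypothetical helper: sums the line lengths without holding the file in memory.
def totalLength(path: Path): Long =
  Using.resource(Source.fromFile(path.toFile)) { source =>
    // getLines() is an Iterator[String], so len returns a lazy Iterator[Int].
    // It has to be consumed here, before Using closes the source.
    len(source.getLines()).foldLeft(0L)(_ + _)
  }
```

Returning the Iterator[Int] itself out of the Using block would not work, because the underlying file is already closed by the time the caller starts reading from it.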