
Scala: How to convert a Seq[Array[String]] into Seq[Double]?


I need to split the data in a Seq[Array[String]] into two Seq[Double] values.

Sample data: ([4.0|1492168815],[11.0|1491916394],[2.0|1491812028]).

I tried var action1, timestamp1 = seq.map(t => (t.split("|"))).flatten.asInstanceOf[Seq[Double]] but did not get the expected results. Any suggestions?


Solution

  • Assuming each input element has the format "[double1|double2]":

    scala> Seq("[4.0|1492168815]","[11.0|1491916394]","[2.0|1491812028]")
    res72: Seq[String] = List([4.0|1492168815], [11.0|1491916394], [2.0|1491812028])
    

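    Drop the leading [ and the trailing ], then split on \\|. The pipe must be escaped because split takes a regular expression, in which | is the alternation metacharacter.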

    scala> res72.flatMap {_.dropRight(1).drop(1).split("\\|").toList}.map{_.toDouble}
    res74: Seq[Double] = List(4.0, 1.492168815E9, 11.0, 1.491916394E9, 2.0, 1.491812028E9)
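
    To illustrate the escaping point, here is a minimal sketch comparing three ways of splitting on a pipe. The object and value names are made up for this example. The Char overload of split is an alternative to escaping, since a Char separator is taken literally rather than as a regex.

    ```scala
    object SplitOnPipe {
      val raw = "4.0|1492168815"

      // String argument is a regex: a bare "|" means "empty OR empty",
      // which matches between characters instead of at the pipe.
      val unescaped: List[String] = raw.split("|").toList

      // Escaped regex: matches the literal pipe character.
      val escaped: List[String] = raw.split("\\|").toList // List(4.0, 1492168815)

      // Char argument: taken literally, so no escaping is needed.
      val byChar: List[String] = raw.split('|').toList // List(4.0, 1492168815)
    }
    ```

    The unescaped variant is most likely what went wrong in the question's attempt: it breaks the string into single characters rather than two fields.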
    

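    Alternatively, if seq is the Seq[Array[String]] from the question and its elements have no surrounding brackets, you can build (action, timestamp) pairs: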

    scala> val actTime = seq.flatMap(t => t.map(x => { val temp = x.split("\\|"); (temp(0), temp(1))}))
    actTime: Seq[(String, String)] = List((4.0,1492168815), (11.0,1491916394), (2.0,1491812028))
    

    To separate them into two Seq[Double] values:

    scala> val action1 = actTime.map(_._1.toDouble)
    action1: Seq[Double] = List(4.0, 11.0, 2.0)
    
    scala> val timestamp1 = actTime.map(_._2.toDouble)
    timestamp1: Seq[Double] = List(1.492168815E9, 1.491916394E9, 1.491812028E9)
    

    If the input may contain non-numeric data, use Try for a safer Double conversion. Values that fail to parse are dropped:

    scala> Seq("[4.0|1492168815]","[11.0|1491916394]","[2.0|1491812028]", "[abc|abc]")
    res75: Seq[String] = List([4.0|1492168815], [11.0|1491916394], [2.0|1491812028], [abc|abc])
    
    scala> import scala.util.Success
    import scala.util.Success
    
    scala> import scala.util.Try
    import scala.util.Try
    
    scala> res75.flatMap {_.dropRight(1).drop(1).split("\\|").toList}
                .map{d => Try(d.toDouble)}
                .collect {case Success(x) => x }
    res83: Seq[Double] = List(4.0, 1.492168815E9, 11.0, 1.491916394E9, 2.0, 1.491812028E9)
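
    One caveat with the flattened version: each value is dropped individually, so a record such as "[11.0|oops]" would contribute only its action and the actions and timestamps would no longer line up. Below is a minimal sketch that parses one record at a time and discards the whole record if either field fails. The names SafePairs and parseRecord, and the "oops" record, are hypothetical and only serve the example.

    ```scala
    import scala.util.Try

    object SafePairs {
      val input: Seq[String] =
        Seq("[4.0|1492168815]", "[11.0|oops]", "[2.0|1491812028]", "[abc|abc]")

      // Parse one "[action|timestamp]" record; None if it is malformed in any way.
      def parseRecord(record: String): Option[(Double, Double)] =
        record.stripPrefix("[").stripSuffix("]").split('|') match {
          case Array(a, t) => Try((a.toDouble, t.toDouble)).toOption
          case _           => None // wrong number of fields
        }

      // flatMap over Option keeps only the records that parsed completely;
      // unzip then yields two sequences of equal length.
      val (actions, timestamps) = input.flatMap(r => parseRecord(r)).unzip
    }
    ```

    Here only the first and third records survive, so actions and timestamps stay aligned index by index.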