
Scala: Parsing Array of String to a case class


I have created a case class like this:

def case_class(): Unit = {
   case class StockPrice(quarter : Byte,
                      stock : String,
                      date : String,
                      open : Double,
                      high : Double,
                      low : Double,
                      close : Double,
                      volume : Double,
                      percent_change_price : Double,
                      percent_change_volume_over_last_wk : Double,
                      previous_weeks_volume : Double,
                      next_weeks_open : Double,
                      next_weeks_close : Double,
                      percent_change_next_weeks_price : Double,
                      days_to_next_dividend : Double,
                      percent_return_next_dividend : Double
                     )
}

And I have thousands of lines as an Array of String, like this:

1,AA,1/7/2011,$15.82,$16.72,$15.78,$16.42,239655616,3.79267,,,$16.71,$15.97,-4.42849,26,0.182704

1,AA,1/14/2011,$16.71,$16.71,$15.64,$15.97,242963398,-4.42849,1.380223028,239655616,$16.19,$15.79,-2.47066,19,0.187852

1,AA,1/21/2011,$16.19,$16.38,$15.60,$15.79,138428495,-2.47066,-43.02495926,242963398,$15.87,$16.13,1.63831,12,0.189994

1,AA,1/28/2011,$15.87,$16.63,$15.82,$16.13,151379173,1.63831,9.355500109,138428495,$16.18,$17.14,5.93325,5,0.185989

How can I parse the data from the Array into that case class? Thank you for your help!


Solution

  • You can proceed as below (I've used a simplified example).

    Given your case class and data (lines):

    // Your case-class
    case class MyCaseClass(
      fieldByte: Byte,
      fieldString: String,
      fieldDouble: Double
    )
    
    // input data
    val lines: List[String] = List(
      "1,AA,$1.1",
      "2,BB,$2.2",
      "3,CC,$3.3"
    )
    

    Note: you can read the lines from a text file with

    import scala.io.Source
    val lines = Source.fromFile("my_file.txt").getLines().toList
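
    The one-liner for reading the file never closes it. A minimal sketch of a variant that does, assuming Scala 2.13 or later (where scala.util.Using is available); the names FileInput and readLines are hypothetical:

    ```scala
    import scala.io.Source
    import scala.util.{Try, Using}

    object FileInput {
      // Hypothetical helper: reads all lines eagerly, then closes the file.
      // Returns a Failure instead of throwing if the file cannot be read.
      def readLines(path: String): Try[List[String]] =
        Using(Source.fromFile(path))(_.getLines().toList)
    }
    ```

    For example, FileInput.readLines("my_file.txt").getOrElse(Nil) falls back to an empty list when the file is missing.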
    

    You can have some utility methods for mapping (cleaning & parsing)

    // remove '$' symbols from string
    def removeDollars(line: String): String = line.replaceAll("\\$", "")
    
    // split string into tokens and
    // convert into MyCaseClass object
    def parseLine(line: String): MyCaseClass = {
      // split returns an Array; typing it as Seq relies on a conversion deprecated in Scala 2.13
      val tokens: Array[String] = line.split(",")
      MyCaseClass(
        fieldByte = tokens(0).toByte,
        fieldString = tokens(1),
        fieldDouble = tokens(2).toDouble
      )
    }
    

    And then use them to convert strings into case-class objects

    // conversion
    val myCaseClassObjects: Seq[MyCaseClass] = lines.map(removeDollars).map(parseLine)
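
    Applied to the rows in the question, the simplified version needs one extra step: some fields are empty (the first sample row contains ",,"), and calling toDouble on an empty string throws a NumberFormatException. A sketch under these assumptions: the two columns that are empty in the sample are modeled as Option[Double] (a change from the case class in the question), every row has exactly 16 fields, and the names StockPriceParser, opt and parseStockPrice are hypothetical:

    ```scala
    object StockPriceParser {
      // Same fields as in the question, except that the two columns which are
      // empty in the first sample row are Option[Double] (an assumption).
      case class StockPrice(
        quarter: Byte,
        stock: String,
        date: String,
        open: Double,
        high: Double,
        low: Double,
        close: Double,
        volume: Double,
        percent_change_price: Double,
        percent_change_volume_over_last_wk: Option[Double],
        previous_weeks_volume: Option[Double],
        next_weeks_open: Double,
        next_weeks_close: Double,
        percent_change_next_weeks_price: Double,
        days_to_next_dividend: Double,
        percent_return_next_dividend: Double
      )

      // Empty token -> None, otherwise parse it as a Double
      private def opt(token: String): Option[Double] =
        if (token.trim.isEmpty) None else Some(token.trim.toDouble)

      def parseStockPrice(line: String): StockPrice = {
        // Strip '$' first; the -1 limit keeps trailing empty fields
        val t = line.replace("$", "").split(",", -1)
        require(t.length == 16, s"expected 16 fields but got ${t.length}: $line")
        StockPrice(
          t(0).toByte, t(1), t(2),
          t(3).toDouble, t(4).toDouble, t(5).toDouble, t(6).toDouble,
          t(7).toDouble, t(8).toDouble,
          opt(t(9)), opt(t(10)),
          t(11).toDouble, t(12).toDouble, t(13).toDouble,
          t(14).toDouble, t(15).toDouble
        )
      }
    }
    ```

    With the rows held in a collection named lines, lines.map(StockPriceParser.parseStockPrice) gives the parsed objects. A malformed row still throws, so wrap the call in scala.util.Try if bad rows should be skipped rather than abort the run.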
    

    As a more advanced (and more general) approach, you can generate the parsing function that maps tokens to the fields of your case class, for example by using reflection.