Tags: scala, loops, apache-spark, seq

Iterating through a Seq[Row] until a particular condition is met using Scala


I need to iterate over a Scala Seq of Row type until a particular condition is met. I don't need to process anything after the condition is met.

I have a Seq[Row] r -> WrappedArray([1/1/2020,abc,1],[1/2/2020,pqr,1],[1/3/2020,stu,0],[1/4/2020,opq,1],[1/6/2020,lmn,0])

I want to iterate through this collection, checking getInt(2) on each row, until I encounter 0. As soon as I encounter 0, I need to break out of the iteration and collect getString(1) from every row up to and including that one. I don't need to look at any data after that.

My output should be: Array(abc,pqr,stu)

I am new to Scala programming. This Seq was originally a DataFrame. I know how to handle this using Spark DataFrames, but due to restrictions imposed by my organization, window functions and the createDataFrame function are not available or do not work in our environment. Hence I have to resort to plain Scala to achieve the same.

All I could come up with is something like the code below, but it does not really work:

breakable{
for(i <- r)
var temp = i.getInt(3)===0
if(temp ==true)
{
val = i.getInt(2)
break()
}
}

Can someone please help me here!


Solution

  • You can use the takeWhile method to take elements for as long as the value is 1 (here s is the Seq[Row], called r in the question):

    s.takeWhile(_.getInt(2) == 1).map(_.getString(1))
    

    That will give you

    List(abc, pqr)
    

    So you still need the first element whose int value is 0, which you can get as follows:

    s.find(_.getInt(2)== 0).map(_.getString(1)).get
    

    Putting it all together (and handling the case where no row has the value 0):

    s.takeWhile(_.getInt(2) == 1).map(_.getString(1)) ++ s.find(_.getInt(2)== 0).map(r => List(r.getString(1))).getOrElse(Nil)
    

    Result:

    Seq[String] = List(abc, pqr, stu)
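
The combined expression above scans the start of the sequence twice, once for takeWhile and once for find. As a possible alternative, here is a minimal sketch that uses span to split the sequence at the first 0 and then keeps that one extra row. SampleRow and namesThroughFirstZero are hypothetical names introduced only so the sketch runs without Spark on the classpath; with Spark available, the same function body should work on a Seq[org.apache.spark.sql.Row]. Unlike the takeWhile(_ == 1) version, this one stops only at 0, so any other non-zero flag is treated like 1.

```scala
// Hypothetical stand-in for org.apache.spark.sql.Row, used only to keep this
// sketch self-contained. It mimics the two accessors the question relies on.
final case class SampleRow(values: Any*) {
  def getInt(i: Int): Int = values(i).asInstanceOf[Int]
  def getString(i: Int): String = values(i).asInstanceOf[String]
}

// Returns column 1 of every row up to and including the first row whose
// column 2 is 0. If no row has a 0, all rows are returned.
def namesThroughFirstZero(rows: Seq[SampleRow]): Seq[String] = {
  // span splits at the first row that fails the predicate:
  // `before` holds the leading non-zero rows, `rest` starts at the first 0 row.
  val (before, rest) = rows.span(_.getInt(2) != 0)
  // rest.take(1) is the first 0 row, or empty when there is none.
  (before ++ rest.take(1)).map(_.getString(1))
}

// Sample data copied from the question.
val sample: Seq[SampleRow] = Seq(
  SampleRow("1/1/2020", "abc", 1),
  SampleRow("1/2/2020", "pqr", 1),
  SampleRow("1/3/2020", "stu", 0),
  SampleRow("1/4/2020", "opq", 1),
  SampleRow("1/6/2020", "lmn", 0)
)

val names: Seq[String] = namesThroughFirstZero(sample) // List(abc, pqr, stu)
```

Because rest.take(1) is simply empty when no 0 is present, there is no Option to unwrap and nothing that can throw, which is the same safety the getOrElse(Nil) gives in the expression above.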