I need to sequentially group rows of similar (by id) information from about 20 lists of lists. In SQL that would be called a join
.
Towards this end I'd like to pop elements from the head of a sorted list based on id equality, add them to the current row and leave the rest for later processing.
A contrived example for such a pop for a singe list (or table if you wish) will be the following:
import scala.collection.mutable.Stack
val ss = Stack(1,1,2,3,4,5,6,7,8,9)
@annotation.tailrec
def retAll(ss: Stack[Int], x: Int, out: List[Int] = List()): List[Int] = {
if (ss.head == x) {
val h = ss.pop
retAll(ss, x, h +: out)
}
else out
}
println(retAll(ss, 1))
println(ss)
// List(1, 1)
// Stack(2, 3, 4, 5, 6, 7, 8, 9)
which means that next time I'll retrieve 2
from the head of the stack left.
I'm wondering if there exists a more idiomatic or efficient approach to obtain the same result and, what is more important, if this construct can be parallelized later, through Futures
eg, given ss
is mutable?
Could you just use a List
, and use the efficiency list.head
(in my example hidden inside span
) be your pop operation?
For thread-safety as you mutate the reference, you might just wrap it in java.util.concurrent.atomic.AtomicReference
.
import java.util.concurrent.atomic.AtomicReference
val ss = AtomicReference(List(1,1,2,3,4,5,6,7,8,9))
def retAll(stack: AtomicReference[List[Int]], x: Int): List[Int] = {
val ( popped, remainder ) = stack.get.span(_ == x)
if (popped.nonEmpty) stack.set(remainder)
popped
}
println(retAll(ss, 1))
println(ss)
// List(1, 1)
// List(2, 3, 4, 5, 6, 7, 8, 9)
@GaëlJ makes a good point in the comments. Even though the reference is atomic, the get and set are distinct operations, and concurrent threads could interleave them to bad effect.
Probably the surest way to deal with thread safety here is to keep all access to and mutation of the list behind a lock with old-school synchronization. That said, if retAll
is the only operation that will access the list, or if you are careful to do a similar work with respect to all other operations, a more correct version might be:
import java.util.concurrent.atomic.AtomicReference
val ss = AtomicReference(List(1,1,2,3,4,5,6,7,8,9))
def retAll(stack: AtomicReference[List[Int]], x: Int): List[Int] = {
var out : List[Int] = null // var in a single-threaded local context is okay
stack.updateAndGet { list =>
val ( popped, remainder ) = list.span(_ == x)
out = popped
remainder
}
out
}
println(retAll(ss, 1))
println(ss)
// List(1, 1)
// List(2, 3, 4, 5, 6, 7, 8, 9)
It's a bit less elegant than the original-but-not-quite-thread-safe solution.
Here's an old-school sync'ed version:
class QueueManager {
// protected by this' lock
private var ss = List(1,1,2,3,4,5,6,7,8,9)
def current : List[Int] = this.synchronized(ss)
def retAll(x: Int): List[Int] = this.synchronized {
val ( popped, remainder ) = ss.span(_ == x)
this.ss = remainder
popped
}
}
val qm = new QueueManager
qm.synchronized {
println(qm.retAll(1))
println(qm.current)
}
// List(1, 1)
// List(2, 3, 4, 5, 6, 7, 8, 9)