I don't know closures in Kotlin so I want to understand this aspect. I have a class Foo that holds a list bar, technically immutable. But if I need to change it I create a new list and replace the value of the class member Foo.bar.
Then I create a Unit that will loop through that list, so I send Foo.bar as a closure, but it processes over a long time (with "sleeps" in between), therefore there is plenty of time for the contents of Foo.bar to change.
How do I ensure the traversal is not affected by those changes (print 1 to 10)? Is the closure ensuring that already? Or do I need to make a copy of the list (element by element, not just a reference to the list) before sending it to the closure?
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

class Foo(var bar: List<Int>)

suspend fun <T> Collection<T>.traverseOverTime(block: (T) -> Unit) {
    this.iterator().traverseOverTime(block)
}

// Processes two elements, sleeps for a second, then continues with the same iterator.
suspend fun <T> Iterator<T>.traverseOverTime(block: (T) -> Unit) {
    repeat(2) {
        if (!hasNext()) return
        block(next())
    }
    delay(1000)
    traverseOverTime(block)
}

fun main() = runBlocking {
    val foo = Foo(listOf(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
    launch {
        foo.bar.traverseOverTime { println(it.toString()) }
    }
    // case 1.: I hear this changes the list element regardless of immutable status
    delay(500)
    try {
        (foo.bar as MutableList<Int>)[2] = 1000
        println("case 1.: ${foo.bar}")
    } catch (e: Exception) { println("case 1.: $e") }
    // case 2.: so would this
    delay(1000)
    try {
        (foo.bar as MutableList<Int>).removeAt(3)
        println("case 2.: ${foo.bar}")
    } catch (e: Exception) { println("case 2.: $e") }
    // case 3.: and this fully changes foo.bar
    delay(1000)
    foo.bar = listOf(100)
    println("case 3.: ${foo.bar}")
}
This outputs the following:
0
1
case 1.: [0, 1, 1000, 3, 4, 5, 6, 7, 8, 9, 10]
1000
3
case 2.: java.lang.UnsupportedOperationException
4
5
case 3.: [100]
6
7
8
9
10
The question is not easy to answer because it touches on multiple topics. I won't discuss the specific examples in the question; instead, I'll talk about concurrent modification in general. I'll focus on 4 subtopics:
1. Modify a reference to a collection vs modify contents of a collection.
First of all, we need to understand there is a huge difference between:
foo.bar = listOf(100)
And:
foo.bar[2] = 1000
foo.bar.removeAt(3)
The first example updates the foo object by replacing its reference to one list with a reference to an entirely different list. There are two lists and we switch between them. The second example modifies the contents of the existing list stored in foo.bar. In other words, the first example modifies foo and doesn't change the contents of any list. The second example doesn't modify foo, but modifies the contents of a list.
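The distinction can be sketched as follows. This is only an illustration: the helper names replaceReference and modifyContents are made up here, and the list is created with mutableListOf so that both modifications are actually allowed.

```kotlin
class Foo(var bar: List<Int>)

// Replaces the reference held by foo; no list is modified.
fun replaceReference(foo: Foo) {
    foo.bar = listOf(100)
}

// Modifies the list object itself; no Foo is touched.
fun modifyContents(list: MutableList<Int>) {
    list[2] = 1000
    list.removeAt(3)
}

fun main() {
    val original = mutableListOf(0, 1, 2, 3)

    val first = Foo(original)
    replaceReference(first)
    println(original)   // [0, 1, 2, 3]: the old list is untouched
    println(first.bar)  // [100]: foo now points at a different list

    val second = Foo(original)
    modifyContents(original)
    println(second.bar)               // [0, 1, 1000]: same list object, new contents
    println(second.bar === original)  // true
}
```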
2. Modify a collection and iterate over it concurrently.
This difference is important: once we have started iterating over foo.bar, whether with foo.bar.forEach or for (x in foo.bar), changing foo.bar to another list doesn't affect us at all. We are already iterating over the list, we keep our own reference to it, and we no longer look at the foo.bar property.
It is very different if the list contents are modified from another thread while we iterate over it. This is generally not a good idea; in most cases we will get a ConcurrentModificationException.
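Both behaviours can be reproduced without any threads, which keeps the result deterministic. The sketch below assumes the JVM, where mutableListOf returns a java.util.ArrayList with a fail-fast iterator; the function names are hypothetical.

```kotlin
import java.util.ConcurrentModificationException

class Foo(var bar: List<Int>)

// Swaps foo.bar for another list in the middle of the loop.
fun iterateWhileReplacing(foo: Foo): List<Int> {
    val seen = mutableListOf<Int>()
    for (x in foo.bar) {        // the iterator is obtained from the current list here
        seen += x
        foo.bar = listOf(100)   // replacing the reference does not affect the running loop
    }
    return seen
}

// Adds an element to the list in the middle of the loop.
fun iterateWhileModifying(list: MutableList<Int>): String =
    try {
        for (x in list) {
            if (x == 1) list.add(99)   // structural change behind the iterator's back
        }
        "completed"
    } catch (e: ConcurrentModificationException) {
        "ConcurrentModificationException"
    }

fun main() {
    println(iterateWhileReplacing(Foo(listOf(0, 1, 2))))    // [0, 1, 2]
    println(iterateWhileModifying(mutableListOf(1, 2, 3)))  // ConcurrentModificationException
}
```

Note that the exception is a best-effort check made by the iterator, so code should not rely on it being thrown.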
3. Closures.
As a general rule, closures store copies of references. But this is still more complicated than it sounds and we could have many different scenarios here. For example:
launch {
foo.bar.forEach { ... }
}
Here the lambda stores a reference to foo. When the lambda is executed, it reads bar from foo and starts iterating over it. If we first create the lambda, then replace foo.bar with another list, and only then execute the lambda, it will see the new list.
val bar = foo.bar
launch {
bar.forEach { ... }
}
This is different. The lambda stores a reference to bar directly, so if we replace the list stored in foo.bar, it will still use the old one. This example may seem rather artificial, but imagine any kind of utility function that accepts a list and to which we pass foo.bar: it doesn't even know about foo, it only knows the value foo.bar had at the time of the call.
var bar = foo.bar
launch {
bar.forEach { ... }
}
bar = listOf(100)
Here the lambda stores a reference to bar, but bar is a mutable variable (var), so the closure will see that we replaced it with an entirely different list.
foo.bar.forEach {
println(it)
}
This example refers to the first version of the question, which talked about closures even though the only closure in it looked like the one above. In this case the closure doesn't know anything about the list itself. It only receives the items one by one, and that is all it knows.
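The first three scenarios can be checked with plain lambdas instead of launch, which makes the order of "create the lambda, replace the list, run the lambda" explicit. This is a minimal sketch; the function names are invented for the illustration.

```kotlin
class Foo(var bar: List<Int>)

// Scenario 1: the lambda captures foo and reads foo.bar only when it runs.
fun captureOwner(): List<Int> {
    val foo = Foo(listOf(1, 2, 3))
    val read = { foo.bar }
    foo.bar = listOf(100)
    return read()   // sees the new list
}

// Scenario 2: the lambda captures a val that holds the list current at that moment.
fun captureVal(): List<Int> {
    val foo = Foo(listOf(1, 2, 3))
    val bar = foo.bar
    val read = { bar }
    foo.bar = listOf(100)
    return read()   // still sees the old list
}

// Scenario 3: the lambda captures a local var; reassigning it is visible to the lambda.
fun captureVar(): List<Int> {
    val foo = Foo(listOf(1, 2, 3))
    var bar = foo.bar
    val read = { bar }
    bar = listOf(100)
    return read()   // sees the new list
}

fun main() {
    println(captureOwner())  // [100]
    println(captureVal())    // [1, 2, 3]
    println(captureVar())    // [100]
}
```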
4. Ways to fix the problem.
Generally speaking, the problems of modifying a list's contents and of modifying a reference to a list are rather orthogonal. If we have already started iterating over a collection, it doesn't matter that the reference to it is replaced. Modifying its contents, however, will typically crash the iteration. On the other hand, closures care about references, but they don't really care about list contents. What matters is the point in time when we start iterating: this is when we "stick" to a specific collection.
There is no simple answer to how to fix the problem of a closure observing a different collection than the one we intended. It really depends on the specific case. I hope the examples above explain how closures observe references and how to make them observe exactly what we need.
There are multiple ways to fix the problem of modifying the collection we iterate over. Again, it depends on the specific case. We can use mutexes / synchronized blocks to prevent concurrent access to the collection. We can create a copy before we start iterating, but that also requires a mutex: creating a copy is itself an iteration, so the list must not be modified while it is being copied.
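One way the "copy under a lock" idea might look is sketched below. GuardedList is a hypothetical wrapper, not something from the question, and it uses the JVM synchronized function from the Kotlin standard library rather than a coroutine Mutex.

```kotlin
class GuardedList {
    private val lock = Any()
    private val items = mutableListOf<Int>()

    // All writes go through the lock.
    fun add(value: Int) {
        synchronized(lock) { items.add(value) }
    }

    // Copying iterates the list, so it takes the same lock as the writes.
    fun snapshot(): List<Int> = synchronized(lock) { items.toList() }
}

fun main() {
    val guarded = GuardedList()
    (1..3).forEach { guarded.add(it) }

    // We iterate over the copy, so adding to the underlying list is safe.
    for (x in guarded.snapshot()) {
        guarded.add(x * 10)
    }
    println(guarded.snapshot())  // [1, 2, 3, 10, 20, 30]
}
```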
Another solution, which is encouraged in languages like Kotlin, is to prefer immutable/read-only collections over mutable ones. Instead of modifying the list in place, we assume the list itself is never mutated. To change it, we create a copy (this is safe, because nobody can modify the list while we copy it), apply the modifications to the copy, and then replace foo.bar entirely. This way we can iterate over the list at any time and it is safe. Remember that when we replace foo.bar, all ongoing iterating/copying operations still use the old list, so we won't interfere with them. One caveat is that if we run 2 modification operations concurrently and the second one should see the results of the first one, we still need some kind of synchronization.
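A minimal sketch of this copy-and-replace approach, under some assumptions: this Foo is a variation of the class from the question, the update function and writeLock are hypothetical additions, and @Volatile plus synchronized are used only to illustrate the caveat about writers needing to see each other's results.

```kotlin
class Foo(initial: List<Int>) {
    // Readers take whatever list is current; it is never modified after publication.
    @Volatile
    var bar: List<Int> = initial
        private set

    private val writeLock = Any()

    // Copy, modify the copy, publish it. Writers are serialized so each one
    // starts from the result of the previous one.
    fun update(transform: (MutableList<Int>) -> Unit) {
        synchronized(writeLock) {
            val copy = bar.toMutableList()
            transform(copy)
            bar = copy
        }
    }
}

fun main() {
    val foo = Foo(listOf(0, 1, 2, 3))
    val seen = mutableListOf<Int>()
    for (x in foo.bar) {               // sticks to the list that is current right now
        seen += x
        foo.update { it.removeAt(0) }  // replaces foo.bar; the loop keeps the old list
    }
    println(seen)     // [0, 1, 2, 3]
    println(foo.bar)  // []
}
```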