I have a closure in a function like this:
func permutate(ch chan []int, numbers []int, r int) {
// ... see the full program below
perm := make([]int, r, r)
nextPerm := func() []int {
for i, ind := range indices[:r] {
perm[i] = numbers[ind]
}
return perm
}
// later writing to ch in two places:
// ch <- nextPerm()
// ...
}
This works differently when I allocate the perm
variable inside the closure:
func permutate(ch chan []int, numbers []int, r int) {
// ...
nextPerm := func() []int {
perm := make([]int, r, r)
for i, ind := range indices[:r] {
perm[i] = numbers[ind]
}
return perm
}
// ...
}
I don't understand why. What's the difference between the two variations?
I only run permutate
in exactly one goroutine, so writing to the channel should happen in a serial manner, so no two goroutines should modify the perm
variable at once.
I tried to debug what is happening, but I guess it's a Heisenbug because during the debugging, the race condition will not happen, so I guess it has something to do with the scheduling of goroutines.
Here is the full program (with the global perm
variable):
package main
import (
"errors"
"fmt"
)
func IterPermutations(numbers []int, r int) <-chan []int {
if r > len(numbers) {
err := errors.New("r cannot be bigger than the length of numbers")
panic(err)
}
ch := make(chan []int)
go func() {
defer close(ch)
permutate(ch, numbers, r)
}()
return ch
}
// an implementation similar to Python standard library itertools.permutations:
// https://docs.python.org/3.8/library/itertools.html#itertools.permutations
func permutate(ch chan []int, numbers []int, r int) {
n := len(numbers)
if r < 0 {
r = n
}
indices := make([]int, n, n)
for i := 0; i < n; i++ {
indices[i] = i
}
cycles := make([]int, r, r)
for i := 0; i < r; i++ {
cycles[i] = n - i
}
perm := make([]int, r, r)
nextPerm := func() []int {
for i, ind := range indices[:r] {
perm[i] = numbers[ind]
}
return perm
}
ch <- nextPerm()
if n < 2 {
return
}
var tmp []int
var j int
for i := r - 1; i > -1; i-- {
cycles[i] -= 1
if cycles[i] == 0 {
tmp = append(indices[i+1:], indices[i])
indices = append(indices[:i], tmp...)
cycles[i] = n - i
} else {
j = len(indices) - cycles[i]
indices[i], indices[j] = indices[j], indices[i]
ch <- nextPerm()
i = r // start over the cycle
// i-- will apply, so i will be r-1 at the start of the next cycle
}
}
}
func main() {
for perm := range IterPermutations(phaseSettings, 3) {
fmt.Println(perm)
}
}
This is a data race. When you declare perm
outside of the closure, the closure re-uses perm
everytime it is called and modifies it.
After main goroutine receives the slice over the channel, the permutate
goroutine can keep running and calls the next nextPerm()
- which modifies the slice, as explained. This may or may not happen before the main goroutine use it (or even happen in middle of something), which is a data race. So fmt.Println(perm)
may print the next iter of permutation or the right one (or in rare case, a mix-up of two).
When you declare perm
inside of the closure, it is a new variable and has new allocation of underlying arrays each time the closure is called. So nothing is shared, and no data is raced on.
Note: The race detector of Go might be unable to detect a data race everytime - since data race may simply not happen everytime. To read more on the race detecter, see https://blog.golang.org/race-detector and https://github.com/google/sanitizers/wiki/ThreadSanitizerAlgorithm .