I'm trying to generate combinations (e.g. every 6 combo out of 10 numbers) in parallel using Golang.
I have a solution that runs serially: Serial Code
For the case where the number of items(n) = 3 and the sample size (r) = 2, the output is:
Got [1 2]
Got [1 3]
Got [2 3]
Now I have tried parallelizing this and here is that code: Parallel Code. It doesn't work and I don't know why. For the same problem the output is:
Put [3 3] into the channel.
Got [3 3] out of the channel.
Put [3 3] into the channel.
Got [3 3] out of the channel.
Put [3 3] into the channel.
Got [3 3] out of the channel.
Any help much appreciated.
First, there is a data race.
$ go run -race main.go
==================
WARNING: DATA RACE
Write at 0x00c0000b0018 by goroutine 11:
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:33 +0x11e
Previous write at 0x00c0000b0018 by goroutine 9:
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:33 +0x11e
Goroutine 11 (running) created at:
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:35 +0x211
Goroutine 9 (running) created at:
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:35 +0x211
==================
==================
WARNING: DATA RACE
Read at 0x00c0000b0010 by goroutine 12:
runtime.slicecopy()
/usr/lib/go-1.13/src/runtime/slice.go:197 +0x0
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:21 +0x39c
Previous write at 0x00c0000b0010 by goroutine 10:
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:33 +0x11e
Goroutine 12 (running) created at:
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:35 +0x211
Goroutine 10 (running) created at:
main.combinationUtil()
/home/leaf/spike/stackoverflow/concomb/main.go:40 +0x2dc
==================
This data race is caused by writing to the underlying array of data
concurently - which means, underlying array of data
(and thus the content of data
) is shared on all goroutine. That is undesired.
In Go, what is under the hood of slice
is a slice header (see reflect.SliceHeader
), it keeps in three bytes the len
, cap
and ptr
to the underlying array. When you copy it, it does not copy the underlying array, but rather only the header, so the underlying array is shared and raced - neither is desired.
To avoid that, just make a new copy of it. (using sync techniques may avoid race, but the content is still shared). However, that alone is a costing operation, both in terms of time and space. Whether that will worth the benefit of having parellel (if there is any benefit), is another topic and out of scope of this answer.
For example,
newdata := make([]int, r)
copy(newdata, data)
// current is included, put next at next location
newdata[index] = arr[i]
wg.Add(1)
go combinationUtil(ch, wg, arr, n, r, index+1, newdata, i+1)
// current is excluded, replace it with next (Note that
// i+1 is passed, but index is not changed)
wg.Add(1)
go combinationUtil(ch, wg, arr, n, r, index, newdata, i+1)
playground: https://play.golang.org/p/YebyCGapSMs
Note 1: This simple addition of a copy works only because that the recursion does not relies on change of part of data
(data[index:]
). Otherwise, there needs to be a back-copy for newdata
to data
.
Note 2: As I implied before, I doubt how efficient is this kind of parellel. There can be other ways of calculating combinations in parellel, which may utilize power of parellel calculation better, but requires quite different algorithms. However, I am not certain of that, so you should do your own research if interested.