I was studying a blog post about when to use goroutines, and I saw the example pasted below. I don't understand the purpose of the channel created on lines 61 to 65.
It seems the author ranges over the channel to retrieve the documents inside the goroutines. But why not use a string array directly?
58 func findConcurrent(goroutines int, topic string, docs []string) int {
59     var found int64
60
61     ch := make(chan string, len(docs))
62     for _, doc := range docs {
63         ch <- doc
64     }
65     close(ch)
66
67     var wg sync.WaitGroup
68     wg.Add(goroutines)
69
70     for g := 0; g < goroutines; g++ {
71         go func() {
72             var lFound int64
73             for doc := range ch {
74                 items, err := read(doc)
75                 if err != nil {
76                     continue
77                 }
78                 for _, item := range items {
79                     if strings.Contains(item.Description, topic) {
80                         lFound++
81                     }
82                 }
83             }
84             atomic.AddInt64(&found, lFound)
85             wg.Done()
86         }()
87     }
88
89     wg.Wait()
90
91     return int(found)
92 }
This code is an example of one way to distribute work (finding strings within documents) among multiple goroutines. It loads every entry of docs into a buffered channel, closes the channel, and then starts goroutines that each pull documents from the channel until it is empty.
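To see the hand-off in isolation, here is a minimal sketch of the same pattern. The helper name processAll and the file names are made up for illustration, and the per-item work is replaced by a simple tally so the result can be checked. The point it tries to show is that a value sent on a channel is received by exactly one goroutine, and that ranging over a closed channel ends once the buffer is drained.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll is a hypothetical helper (not from the blog post). It hands each
// item to one of `workers` goroutines through a buffered channel and returns
// how many times each item was received.
func processAll(workers int, items []string) map[string]int {
	// The buffer holds every item, so the sends below never block.
	ch := make(chan string, len(items))
	for _, it := range items {
		ch <- it
	}
	// Closing the channel lets each worker's range loop end once it is empty.
	close(ch)

	var mu sync.Mutex // protects seen, which all workers write to
	seen := make(map[string]int)

	var wg sync.WaitGroup
	wg.Add(workers)
	for w := 0; w < workers; w++ {
		go func() {
			defer wg.Done()
			// Each receive removes one item; no two workers get the same one.
			for it := range ch {
				mu.Lock()
				seen[it]++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return seen
}

func main() {
	seen := processAll(3, []string{"a.xml", "b.xml", "c.xml", "d.xml"})
	for name, n := range seen {
		fmt.Println(name, "processed", n, "time(s)")
	}
}
```

Every item should be reported as processed once, regardless of how many workers are started or in which order they run.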
But why not use a string array directly?
It would be possible to use a string array and a variable (let's call it count) to track which item in the array you are up to. You would have some code like this (a little long-winded to demonstrate a point):
for {
    if count >= len(docarray) {
        break
    }
    doc := docarray[count]
    count++
    // Process the document
}
However, you would hit synchronization issues. For example, what happens if two goroutines (running on different processor cores) get to if count >= len(docarray) at the same time? Without something to prevent this, they might both read the same value of count and end up processing the same item in the slice (and potentially skip the next element, because they both run count++).
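For comparison, this is roughly what the slice-plus-counter approach needs in order to be safe. It is only a sketch: the docQueue type and the countProcessed function are names invented here, and a sync.Mutex is one of several ways to guard the index. The lock turns "check the index, read the element, advance the index" into a single step that only one goroutine can perform at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// docQueue is a hypothetical type (not from the blog post) that hands out
// slice elements one at a time.
type docQueue struct {
	mu    sync.Mutex
	count int
	docs  []string
}

// next returns the next unclaimed document, or false when none are left.
// Holding the mutex makes the bounds check, the read and the increment atomic
// with respect to other callers.
func (q *docQueue) next() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count >= len(q.docs) {
		return "", false
	}
	doc := q.docs[q.count]
	q.count++
	return doc, true
}

// countProcessed starts `workers` goroutines that drain the queue and returns
// the total number of documents they claimed.
func countProcessed(workers int, docs []string) int {
	q := &docQueue{docs: docs}

	var totalMu sync.Mutex
	total := 0

	var wg sync.WaitGroup
	wg.Add(workers)
	for w := 0; w < workers; w++ {
		go func() {
			defer wg.Done()
			local := 0
			for {
				if _, ok := q.next(); !ok {
					break
				}
				local++ // stand-in for "process the document"
			}
			totalMu.Lock()
			total += local
			totalMu.Unlock()
		}()
	}
	wg.Wait()
	return total
}

func main() {
	docs := []string{"a.xml", "b.xml", "c.xml", "d.xml", "e.xml"}
	fmt.Println(countProcessed(3, docs))
}
```

This works, but you have to write and reason about the locking yourself. The channel version gets the same one-item-per-goroutine guarantee from the receive operation.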
Synchronization of concurrent code is complex, and issues can be very hard to debug. Using channels hides a lot of this complexity from you and makes it more likely that your code will work as expected. It does not solve every issue: note the use of atomic.AddInt64(&found, lFound) in the example code to prevent another potential problem, which would result from multiple goroutines writing to the same variable at the same time.
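The counting part of the example can be sketched on its own as well. The function sumConcurrently and its parameters are invented for this illustration. Each goroutine counts into its own local variable, which needs no synchronization, and touches the shared total only once, through atomic.AddInt64.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// sumConcurrently is a hypothetical helper (not from the blog post). Each
// goroutine counts locally and publishes its subtotal with one atomic add.
func sumConcurrently(goroutines, perGoroutine int) int {
	var found int64

	var wg sync.WaitGroup
	wg.Add(goroutines)
	for g := 0; g < goroutines; g++ {
		go func() {
			defer wg.Done()
			var lFound int64 // private to this goroutine, so no locking needed
			for i := 0; i < perGoroutine; i++ {
				lFound++
			}
			// A plain "found += lFound" here would be a data race.
			atomic.AddInt64(&found, lFound)
		}()
	}

	// After Wait returns, all the atomic adds have completed.
	wg.Wait()
	return int(found)
}

func main() {
	fmt.Println(sumConcurrently(8, 1000))
}
```

If you replace the atomic add with a plain addition and run the program with go run -race, the race detector should flag the unsynchronized writes to found.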