Search code examples
goconcurrencyslicegoroutine

How to write a concurrent for loop in Go with sync/errgroup package


I would like to concurrently perform an operation on the elements of a slice
I am using the sync/errgroup package to handle concurrency

Here is a minimal reproduction on Go Playground https://go.dev/play/p/yBCiy8UW_80

import (
    "fmt"
    "golang.org/x/sync/errgroup"
)

func main() {
    eg := errgroup.Group{}
    input := []int{0, 1, 2}
    output1 := []int{}
    output2 := make([]int, len(input))
    for i, n := range input {
        eg.Go(func() (err error) {
            output1 = append(output1, n+1)
            output2[i] = n + 1
            return nil
        })
    }
    eg.Wait()
    fmt.Printf("with append %+v", output1)
    fmt.Println()
    fmt.Printf("with make %+v", output2)
}

outputs

with append [3 3 3]
with make [0 0 3]

versus expected [1 2 3]


Solution

  • You have two separate issues going on here:


    First, the variables in your loop are changing before each goroutine gets a chance to read them. When you have a loop like

    for i, n, := range input {
      // ...
    }
    

    the variables i and n live for the whole duration of the loop. When control reaches the bottom of the loop and jumps back up to the top, those variables get assigned new values. If a goroutine started in the loop is using those variables, then their value will change unpredictably. This is why you are seeing the same number show up multiple times in the output of your example. The goroutine started in the first loop iteration doesn't start executing until n has already been set to 2.

    To solve this, you can do what NotX's answer shows and create new variables that are scoped to just a single iteration of the loop:

    for i, n := range input {
      ic, nc := i, n
      // use ic and nc instead of i and n
    }
    

    Variables declared inside a loop are scoped to just a single iteration of the loop, so when the next iteration of the loop starts, entirely new variables get created, preventing the originals from changing between when the goroutine is launched and when it actually starts running.

    Update: As of Go 1.22, this is no longer an issue because loop variables have been changed to have per-iteration scope. You can see the details in the release notes.


    Second you are concurrently modifying the same value from different goroutines, which isn't safe. In particular, you're using append to append to the same slice concurrently. What happens in this case is undefined and all kinds of bad things could happen.

    There are two ways to deal with this. The first one you already have set up: pre-allocate an output slice with make and then have each goroutine fill in a specific position in the slice:

    output := make([]int, 3)
    for i, n := range input {
      ic, nc := i, n
      eg.Go(func() (err error) {
        output[ic] = nc + 1
        return nil
      })
    }
    eg.Wait()
    

    This works great if you know how many outputs you're going to have when you start the loop.

    The other option is to use some kind of locking to control access to the output slice. sync.Mutex works great for this:

    var output []int
    mu sync.Mutex
    for _, n := range input {
      nc := n
      eg.Go(func() (err error) {
        mu.Lock()
        defer mu.Unlock()
        output = append(output, nc+1)
        return nil
      })
    }
    eg.Wait()
    

    This works if you don't know how many outputs you have, but it doesn't guarantee anything about the order of the output - it could be in any order. If you want to put it in order, you can always do some kind of sort after all the goroutines finish.