Search code examples
goconcurrencygoroutinechannel

goroutine inside infinite for loop. Is it a good practice?


So I'm working on a piece of code:

// Main
for {
    c := make(chan string)
    data := make(map[string]string)
    go doStuff(data,c)
    fmt.Println(<-c)

    time.Sleep(2*time.Second)
}

// doStuff
func doStuff(d map[string]string,ch chan string){
    defer close(ch)
    //Code to make changes to passed data
    ch <-"changes made"
}

What this does is it passes a map and a channel to a goroutine, inside which some changes are made in the map and it'll send message and in main it'll print and wait for another modification message and this goes on with interval of 2 seconds until keyboard interruption or some logic after processing the passed data to the goroutine.

I somewhere feel like this is not the efficient way to do it. So my question is whether placing a goroutine inside a infinite for loop is okay or is there any more efficient way to do this?


Solution

  • There's nothing wrong with an infinite loop per se. I often use for { ... } construct when the loop exit condition requires too many commands to be easily put into the for conditional.

    Based on my $GOPATH/src/github.com/ directory, which is obviously a quite incomplete sample set, I see hundreds of such uses beyond my own. Just github.com/docker/docker alone appears to use 454 such infinite loops.

    Less appropriate is the idea of creating a channel in a loop that only ever passes one value. If your goroutine always only returns one value, that return value's presence is sufficient indication that the goroutine is done. Reuse channels whenever possible, and don't close them if you want to send more data later.

    Obviously, in your case, the goroutine is pointless anyway, and just for educational purposes. But consider this, if you're so inclined:

    package main
    
    import (
      "log"
    )
    
    func doStuff(datachan <-chan map[string]string, reschan chan<- int) {
      for {
        data, ok := <-datachan
        if !ok {
          log.Print("Channel closed.")
          break
        }
        log.Printf("Data had %d length: %+v", len(data), data)
        reschan<-len(data)
      }
      return
    }
    
    const workers = 3
    
    func main() {
      var datachan = make(chan map[string]string)
      var reschan = make(chan int)
      var inflight = 0
      var inputs = []map[string]string {
        map[string]string{ "hi": "world" },
        map[string]string{ "bye": "space", "including": "moon" },
        map[string]string{ "bye": "space", "including": "moon" },
        map[string]string{ },
        map[string]string{ },
      }
      // an inline funciton definition can change inflight within main()'s scope
      processResults := func (res int) {
        log.Printf("Main function got result %d", res)
        inflight--
      }
      // start some workers
      for i := 0; i < workers; i++{
        go doStuff(datachan, reschan)
      }
      for _, data := range inputs {
          //Select allows reading from reschan if datachan is not available for
          // writing, thus freeing up a worker to read from datachan next loop
          written := false
          for written  != true {
            select {
              case res := <-reschan:
                processResults(res)
              case datachan <- data:
                inflight++
                written = true
            }
          }
      }
      close(datachan)
      for inflight > 0 {
        processResults(<-reschan)
      }
    }
    
    

    Output:

    2020/10/31 13:15:08 Data had 1 length: map[hi:world]
    2020/10/31 13:15:08 Main function got result 1
    2020/10/31 13:15:08 Data had 0 length: map[]
    2020/10/31 13:15:08 Main function got result 0
    2020/10/31 13:15:08 Data had 0 length: map[]
    2020/10/31 13:15:08 Channel closed.
    2020/10/31 13:15:08 Main function got result 0
    2020/10/31 13:15:08 Data had 2 length: map[bye:space including:moon]
    2020/10/31 13:15:08 Channel closed.
    2020/10/31 13:15:08 Main function got result 2
    2020/10/31 13:15:08 Data had 2 length: map[bye:space including:moon]
    2020/10/31 13:15:08 Channel closed.
    2020/10/31 13:15:08 Main function got result 2
    

    Here, I add a bit more structure to illustrate some more common usages of for { and close(chan).

    I use a potentially infinite loop within the worker goroutines, of which there are 3 (intentionally created more than are used). I count how many times I write to the channel to ensure that I read every response. When the main goroutine ends, all other goroutines are unceremoniously killed, so it's up to me to make sure I have let them complete. Counting results is one simple way to do that.

    I also demonstrate correct use of close(chan). While closing a channel after use, such as you have done, is not incorrect, it's usually unnecessary as open channels will be garbage collected after all references to them are gone anyway. (https://stackoverflow.com/questions/8593645/is-it-ok-to-leave-a-channel-open#:~:text=It's%20OK%20to%20leave%20a,that%20no%20more%20data%20follows.)

    close(chan) is usually used to tell the channel readers that no more data will be available on the channel.

        data, ok := <-datachan
    

    The second value, a boolean, will tell us whether we read data or the channel was actually closed and drained. So this is the receiver part of making sure we've processed all the channel.

    Because I use select, This code can process an inputs of arbitrary length, with a static set of workers. None of these channels are buffered - the reader must be reading for the writer to write. So I need to make sure to receive any results from a worker before I attempt to send another data input to that reader. Using select makes this trivial : the operation succeeds on whatever channel is ready first (if both channels are ready, a choice is selected at random - perfectly functional in this case).

    for {, close(chan) and select, in conclusion, work very well together when sending an unknown number of inputs to goroutine worker bools.

    A few final notes. In the real world, https://gobyexample.com/waitgroups would normally be used instead of implementing this all manually. The concept is generally the same, but it's a lot less keeping track of things and results in cleaner code. I implemented it myself so the concepts were clear.

    And finally, you'll note that there's nothing guaranteeing that the worker goroutines see the closed channel before the program ends. In practice, it's technically possible that the "closed channel" message might not be logged from all goroutines. But the use of inflight counter ensures that I got their results, even if they haven't had a chance to observe the channel's closure. Closing channels and exiting workers makes more sense when an application will continue running multiple batches of workers over time - if we didn't give them notice of closure, but then created more workers later, that would result in a memory leak as those workers would continue to wait for input that would never come. Alternately, its not unusual to use the same set of workers for multiple batches of requests.