go channel goroutine

Broken output from a buffered channel range in a goroutine


Why does a goroutine like the following output sequences of bytes in a random order when using a buffered channel?

Here is the code to replicate the buggy behaviour, where data.csv is a simple CSV of 1000 rows of random data (100 bytes per row approximately) plus the header row (1001 rows in total).

package main

import (
    "bufio"
    "os"
    "time"
)

func main() {

    var channelLength = 10000
    var channel = make(chan []byte, channelLength)

    go func() {
        for c := range channel {
            println(string(c))
        }
    }()

    file, _ := os.Open("./data.csv")
    scanner := bufio.NewScanner(file)

    for scanner.Scan() {
        channel <- scanner.Bytes()
    }

    <-time.After(time.Second * time.Duration(3600))

}

Here are the first 6 lines of the output as an example of what I mean by "broken output":

979,C
tharine,Vero,cveror6@blinklist.com,Female,133.153.12.53
980,Mauriz
a,Ilett,milettr7@theguardian.com,Female,226.123.252.118
981
Sher,De Laci,sdelacir8@nps.gov,Female,137.207.30.217
[...]

On the other hand, the output is correct if channelLength = 0, that is, with an unbuffered channel (first 6 lines again):

id,first_name,last_name,email,gender,ip_address
1,Hebert,Edgecumbe,hedgecumbe0@apple.com,Male,108.84.217.38
2,Minor,Lakes,mlakes1@marriott.com,Male,231.185.189.39
3,Faye,Spurdens,fspurdens2@oakley.com,Female,80.173.161.81
4,Kris,Proppers,kproppers3@gmpg.org,Male,10.80.182.51
5,Bronnie,Branchet,bbranchet4@squarespace.com,Male,118.117.0.5
[...]

The data is randomly generated.


Solution

  • From the bufio.Scanner.Bytes docs:

    The underlying array may point to data that will be overwritten by a subsequent call to Scan

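    You have a data race around the use of the slices you're passing over the channel. Every slice returned by scanner.Bytes can share the scanner's internal buffer, so with a buffered channel the loop keeps calling Scan while earlier slices are still queued, and their contents are overwritten before the goroutine prints them. An unbuffered channel makes each send wait for the receiver, which hides the problem in this program but does not remove the race. You need to copy the data you're sending. In this example, that is most easily accomplished by using a string instead of []byte, and calling scanner.Text.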