First time with go, and trying to get go routines and WaitGroups
working.
I have a CSV file with 100 rows of data. (101 including header)
I have the following simple code:
package main
import (
"bufio"
"fmt"
"io"
"os"
"sync"
"time"
)
func main() {
start := time.Now()
numRows := 0
waitGroup := sync.WaitGroup{}
file, _ := os.Open("./data.csv")
scanner := bufio.NewScanner(file)
scanner.Scan() // to read the header
for scanner.Scan() {
err := scanner.Err()
if err != nil && err != io.EOF {
panic(err)
}
waitGroup.Add(1)
go (func() {
numRows++
waitGroup.Done()
})()
}
waitGroup.Wait()
file.Close()
fmt.Println("Finished parsing ", numRows)
fmt.Println("Elapsed time in seconds: ", time.Now().Sub(start))
}
When i run this, the numRows
output fluctuates between 94 and 100 each time. I'm expecting it to be 100 each time. If i run the same code on a CSV of 10 rows of data, it outputs 10
each and every time.
Seems to me like the final few go routines aren't finishing in time.
I've tried the following which have failed:
CsvReader
instead of a Scanner
waitGroup.Add(1)
to underneath the anonymous funcWhat am i missing?
What do you do with this code:
for scanner.Scan() {
err := scanner.Err()
if err != nil && err != io.EOF {
panic(err)
}
waitGroup.Add(1)
go (func() {
numRows++
waitGroup.Done()
})()
}
Really all the work is done in one main goroutine and only numRows
increment uses separate goroutines. I think this could be simplified to simple increment:
for scanner.Scan() {
err := scanner.Err()
if err != nil && err != io.EOF {
panic(err)
}
numRows++
}
If you want to simulate parallel parsing and pipelining you may use channels. Make only one goroutine responsible for counter increment. Every time when another goroutine wants to increment counter - it sends a message to that channel.