I am trying to write a function with a worker pool and one without, and I created a benchmark test to compare which is faster, but the result shows that the function with the worker pool takes longer than the one without.
Here is the result:
goos: linux
goarch: amd64
BenchmarkWithoutWorker-4    4561    228291 ns/op    13953 B/op    1744 allocs/op
BenchmarkWithWorker-4       1561    651845 ns/op    54429 B/op    2746 allocs/op
The worker pool looks simple, and I am following the example from this Stack Overflow question. Here is the scenario with my worker pool and without:
var wg sync.WaitGroup
// I will get data from the DB; let's say the data length is about 1000
const dataFromDB int = 1000
// in the benchmark, numOfProduce is set to the dataFromDB value defined above
func WithoutWorker(numOfProduce int) {
for i := 0; i < numOfProduce; i++ {
if doSomething(fmt.Sprintf("data %d", i)) != nil {
fmt.Println("error")
}
}
}
func WithWorker(numWorker int) {
jobs := make(chan *Job, dataFromDB)
result := make(chan *Result, 10)
for i := 0; i < numWorker; i++ {
wg.Add(1)
go consume(i, jobs, result)
}
go produce(jobs)
wg.Wait()
// I might analyze the result channel here later
// to return any error to the client if I got one
}
func doSomething(str string) error {
if str == "" {
return errors.New("empty")
}
return nil
}
func consume(workerID int, jobs <-chan *Job, result chan<- *Result) {
defer wg.Done()
for job := range jobs {
//log.Printf("worker %d", workerID)
//log.Printf("job %v", job.ValueJob)
err := doSomething(job.ValueJob)
if err != nil {
result <- &Result{Err: err}
}
}
}
func produce(jobs chan<- *Job) {
for i := 0; i < dataFromDB; i++ { // start at 0 so all dataFromDB jobs are produced
jobs <- &Job{
Id: i,
ValueJob: fmt.Sprintf("data %d", i),
}
}
close(jobs)
}
Am I missing something in my worker pool?
As for the benchmark test code, it looks like the code from tutorials out there :) just simple code to call the functions, plus b.ReportAllocs() added.
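For reference, a minimal, self-contained sketch of the shape of my benchmark (shown here only for the no-worker case; testing.Benchmark from the standard library lets it run as a plain program):

```go
package main

import (
	"errors"
	"fmt"
	"testing"
)

// trimmed-down copies of the functions above
func doSomething(str string) error {
	if str == "" {
		return errors.New("empty")
	}
	return nil
}

func withoutWorker(n int) {
	for i := 0; i < n; i++ {
		if doSomething(fmt.Sprintf("data %d", i)) != nil {
			fmt.Println("error")
		}
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside `go test`
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			withoutWorker(1000)
		}
	})
	fmt.Println(res)
}
```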
If the work you are splitting up across several goroutines / workers is less than the overhead of the communication needed to send a job to a goroutine and receive the result, then it is faster to do the work sequentially in a single goroutine.
In your example you are doing (almost) no work:
func doSomething(str string) error {
if str == "" {
return errors.New("empty")
}
return nil
}
Splitting that up on multiple goroutines is going to slow things down.
Example to illustrate:
If you have a job that takes 5 ns (nanoseconds) and you run it 1000 times, you get
1000 jobs * 5 ns = 0.005 ms on a single core
If you distribute it across 10 cores, each job additionally pays the communication overhead. Let's say the communication overhead is 1 microsecond (1000 ns). Now you have
1000 jobs * (5 ns + 1000 ns) / 10 cores = 0.1005 ms on 10 cores
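The same back-of-the-envelope arithmetic in code, using the made-up numbers from above:

```go
package main

import "fmt"

const (
	jobs       = 1000
	workNs     = 5.0    // per-job work cost (made-up number)
	overheadNs = 1000.0 // assumed channel send+receive cost per job (made-up number)
	cores      = 10.0
)

const (
	singleCoreNs  = jobs * workNs                      // 5000 ns
	distributedNs = jobs * (workNs + overheadNs) / cores // 100500 ns
)

func main() {
	fmt.Printf("single core: %.4f ms\n", singleCoreNs/1e6)  // 0.0050 ms
	fmt.Printf("10 cores:    %.4f ms\n", distributedNs/1e6) // 0.1005 ms
}
```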
This is just an example with made-up numbers and the math is not exact, but it should illustrate the point: communication has a cost, and it is only worth introducing if it is (significantly) smaller than the cost of the job itself.
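One common way to make a pool pay off even for cheap jobs is to batch them, so that one channel send/receive is amortized over many items. This is a sketch of that idea, not code from the question; processBatched and its parameters are hypothetical names:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// processBatched distributes `total` cheap work items across `workers`
// goroutines, sending them in slices of up to `batchSize` so that one
// channel operation covers many items. It returns how many items were
// processed.
func processBatched(total, batchSize, workers int) int64 {
	jobs := make(chan []int, workers)
	var processed int64
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for batch := range jobs {
				// the cheap per-item work would go here; only one
				// channel receive was paid for the whole batch
				atomic.AddInt64(&processed, int64(len(batch)))
			}
		}()
	}

	for start := 0; start < total; start += batchSize {
		end := start + batchSize
		if end > total {
			end = total
		}
		batch := make([]int, 0, end-start)
		for i := start; i < end; i++ {
			batch = append(batch, i)
		}
		jobs <- batch
	}
	close(jobs)
	wg.Wait()
	return atomic.LoadInt64(&processed)
}

func main() {
	fmt.Println("processed:", processBatched(1000, 100, 4)) // processed: 1000
}
```

Whether batching helps still depends on the real per-item cost; for truly trivial work like the doSomething above, sequential code will usually remain fastest.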