Tags: go, optimization, parallel-processing, goroutine, worker-pool

Optimal size of worker pool


I'm building a Go app which uses a "worker pool" of goroutines; initially I start the pool by creating a number of workers. I was wondering what the optimal number of workers would be on a multi-core processor, for example on a CPU with 4 cores? I'm currently using the following approach:

    // init pool
    numCPUs := runtime.NumCPU()

    runtime.GOMAXPROCS(numCPUs + 1) // numCPUs hot threads + one for async tasks.
    maxWorkers := numCPUs * 4

    jobQueue := make(chan job.Job)

    module := Module{
        Dispatcher: job.NewWorkerPool(maxWorkers),
        JobQueue:   jobQueue,
        Router:     router,
    }

    // Start the dispatcher; it pulls jobs off jobQueue and hands them to available workers.
    module.Dispatcher.Run(jobQueue)

The complete implementation is under

job.NewWorkerPool(maxWorkers) and module.Dispatcher.Run(jobQueue)

My use case for the worker pool: I have a service which accepts requests, calls multiple external APIs, and aggregates their results into a single response. Each call can be made independently of the others, as the order of results doesn't matter. I dispatch the calls to the worker pool, where each call is executed asynchronously in an available goroutine. My "request" thread keeps listening on the return channels, fetching and aggregating results as soon as a worker is done. When all are done, the final aggregated result is returned as the response. Since each external API call may have a variable response time, some calls complete earlier than others. As I understand it, doing this in parallel should perform better than calling each external API synchronously, one after the other.
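In essence, the fan-out/fan-in step looks roughly like the sketch below. It's heavily simplified: Result and callAPI are just placeholders for the real response type and the real external API calls, and in the actual code each closure is submitted to the worker pool through the job queue rather than started with go directly.

    package main

    import (
        "fmt"
        "time"
    )

    // Result and callAPI are placeholders for the real response type and the
    // real external API calls.
    type Result struct{ Source, Body string }

    func callAPI(endpoint string) Result {
        time.Sleep(50 * time.Millisecond) // stand-in for variable network latency
        return Result{Source: endpoint, Body: "ok"}
    }

    // aggregate fans the calls out and collects the results as they complete,
    // regardless of order.
    func aggregate(endpoints []string) []Result {
        results := make(chan Result, len(endpoints)) // buffered so no sender blocks

        for _, ep := range endpoints {
            ep := ep
            go func() { results <- callAPI(ep) }()
        }

        out := make([]Result, 0, len(endpoints))
        for range endpoints {
            out = append(out, <-results) // one receive per dispatched call
        }
        return out
    }

    func main() {
        fmt.Println(aggregate([]string{"api-a", "api-b", "api-c"}))
    }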


Solution

  • The comments in your sample code suggest you may be conflating the two concepts of GOMAXPROCS and a worker pool. These two concepts are completely distinct in Go.

    1. GOMAXPROCS sets the maximum number of CPU threads the Go runtime will use. This defaults to the number of CPU cores found on the system, and should almost never be changed. The only time I can think of to change this would be if you wanted to explicitly limit a Go program to use fewer than the available CPUs for some reason; in that case you might set it to 1, for example, even when running on a 4-core CPU. This should only ever matter in rare situations.

      TL;DR: Never set runtime.GOMAXPROCS manually.

    2. Worker pools in Go are a set of goroutines, which handle jobs as they arrive. There are different ways of handling worker pools in Go.

      What number of workers should you use? There is no objective answer. Probably the only way to know is to benchmark various configurations until you find one that meets your requirements (see the sketch at the end of this answer, where the worker count is a single parameter you can vary).

      As a simple case, suppose your worker pool is doing something very CPU-intensive. In this case, you probably want one worker per CPU.

      As a more likely example, though, let's say your workers are doing something more I/O bound--such as reading HTTP requests, or sending email via SMTP. In this case, you may reasonably handle dozens or even thousands of workers per CPU.

      And then there's also the question of whether you should use a worker pool at all. Most problems in Go do not require worker pools. I've worked on dozens of production Go programs, and never once used a worker pool in any of them. I've also written many times more one-time-use Go tools, and only used a worker pool maybe once.

    And finally, GOMAXPROCS and worker pools relate to each other only in the way that goroutines in general relate to GOMAXPROCS. From the docs:

    The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit. This package's GOMAXPROCS function queries and changes the limit.

    From this simple description, it's easy to see that there could be many more goroutines (potentially hundreds of thousands... or more) than GOMAXPROCS--GOMAXPROCS only limits how many "operating system threads can execute user-level Go code simultaneously"; goroutines which aren't executing user-level Go code at the moment don't count. And I/O-bound goroutines (such as those waiting for a network response) aren't executing code. So the theoretical maximum number of goroutines is limited only by your system's available memory.
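    Here is the worker-pool sketch referred to above. It is a generic illustration (not the asker's job package): the number of workers is a single parameter, which is exactly the knob you would benchmark with different values for your workload.

        package main

        import (
            "fmt"
            "sync"
            "time"
        )

        // process is a stand-in for whatever a job actually does (CPU-bound or I/O-bound).
        func process(id int) string {
            time.Sleep(10 * time.Millisecond)
            return fmt.Sprintf("job %d done", id)
        }

        // runPool starts numWorkers goroutines that drain the jobs channel until it
        // is closed, then closes results once all workers have finished.
        func runPool(numWorkers int, jobs <-chan int, results chan<- string) {
            var wg sync.WaitGroup
            for i := 0; i < numWorkers; i++ {
                wg.Add(1)
                go func() {
                    defer wg.Done()
                    for id := range jobs {
                        results <- process(id)
                    }
                }()
            }
            go func() {
                wg.Wait()
                close(results) // signal the consumer that all work is finished
            }()
        }

        func main() {
            jobs := make(chan int)
            results := make(chan string)

            runPool(8, jobs, results) // try 4, 8, 32, ... and measure

            go func() {
                for i := 0; i < 20; i++ {
                    jobs <- i
                }
                close(jobs)
            }()

            for r := range results {
                fmt.Println(r)
            }
        }

    For CPU-bound work you would start with numWorkers around runtime.NumCPU(); for I/O-bound work it can reasonably be much higher, and only benchmarking your actual workload will tell you where the sweet spot is.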
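    And a tiny demonstration that GOMAXPROCS does not cap the number of goroutines: the program below starts 100,000 goroutines, nearly all of them blocked, while the runtime still uses only NumCPU operating-system threads for executing Go code.

        package main

        import (
            "fmt"
            "runtime"
            "sync"
        )

        func main() {
            release := make(chan struct{})
            var wg sync.WaitGroup

            // Start 100,000 goroutines; each immediately blocks on a channel receive,
            // so none of them count against GOMAXPROCS.
            for i := 0; i < 100000; i++ {
                wg.Add(1)
                go func() {
                    defer wg.Done()
                    <-release
                }()
            }

            fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0)) // query without changing it
            fmt.Println("goroutines:", runtime.NumGoroutine())

            close(release) // unblock everything and let the program exit cleanly
            wg.Wait()
        }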