How do I concurrently fetch from with a rate-limited API endpoint?

I can't wrap my head around this issue. I have a service that I need to pull data from, which has a rate-limit of 5 requests per second, even when using the x/rate/limit package and setting up rate.Limiter(5,1) and calling it before each request, it sometimes still hits the rate-limit and I need to back off from sending requests. It is possible that I am competing with another service that could interfere with the request budget, so I want to handle it better.

My problem is that I need to get around this issue, I process 5 requests at a time, but when one requests hits the rate limit, and the next one too, the server would at times add to the time I have to wait before sending another request. So having 5 requests going out, if one hits the rate limit, the chance is greater that the rest will also hit the rate limit and it will get stuck.

How can I effectively get around this issue? I need to reprocess the rate-limited requests by feeding them back to the workers. I am trying to stop all my workers when I hit the rate limit, back-off for the given delay, and then continue processing requests.

Below is some example mock code of what I have:

package main

import (
    "context"
    "log"
    "net/http"
    "strconv"
    "sync"
    "time"

    "golang.org/x/time/rate"
)

// Rate-limit => 5 req/s

const (
    workers = 5
)

func main() {
    ctx, cancel := context.WithCancel(context.Background())

    // Mock function to grab all the serials to use in upcoming requests.
    serials, err := getAllSerials(ctx)
    if err != nil {
        panic(err)
    }

    // Set up for concurrent processing.
    jobC := make(chan string)            // job queue
    delayC := make(chan int)             // channel to receive delay
    resultC := make(chan *http.Response) // channel for results

    var wg *sync.WaitGroup

    // Set up rate limiter.
    limiter := rate.NewLimiter(5, 1)

    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()

            for s := range jobC {
                limiter.Wait(ctx)
                res, err := doSomeRequest(s)
                if err != nil {
                    // Handle error.
                    log.Println(err)
                }

                // Handle rate limit.
                if res.StatusCode == 429 {
                    delay, _ := strconv.Atoi(res.Header.Get("Retry-After"))
                    log.Println("rate limit hit, backing off")

                    // Back off.
                    delayC <- delay

                    // Put serial back into job queue.
                    jobC <- s
                }

                resultC <- res
            }
        }()
    }

    go processResults(ctx, resultC) // call goroutine to read results
    go backOffProcess(ctx, delayC)  // call goroutine to handle backing off

    for _, s := range serials {
        jobC <- s
    }

    wg.Wait()
    close(jobC)
    close(resultC)
    cancel()

    log.Println("Finished process")
}

func doSomeRequest(serial string) (*http.Response, error) {
    // do the request and send back the results
    // ...
    // handle error

    // mock response
    res := &http.Response{}
    return res, nil
}

func getAllSerials(ctx context.Context) []string {
    // Some stuff
    return []string{"a", "b", "c", "d", "e"}
}

func processResults(ctx context.Context, resultC chan *http.Response) {
    for {
        select {
        case r := <-resultC:
            log.Println("Processed result")
        case <-ctx.Done():
            close(resultC)
            return
        }
    }
}

func backOffProcess(ctx context.Context, delayC chan int) {
    for {
        select {
        case d := <-delayC:
            log.Println("Sleeping for", d, "seconds")
            time.Sleep(time.Duration(d) * time.Second)
        case <-ctx.Done():
            close(delayC)
            return
        }
    }
}

What I have noticed is that when 4/5 requests hit the rate limit, the backOffProcess would successfully sleep and delay (all be it for the summed time across all rate-limited requests, where it only has to be the latest, as it will have the new total duration to wait), but when all 5 would hit the rate limit, the workers get stuck and the backOffProcess does not read off of the channel.

What is a better way of achieving this?

Solution

I don't really understand why your backOffProcess is executed in a separate goroutine. I think every worker process should backoff (if it is needed) before executing task. I see it something like this:

 backOffUntil := time.Now()
 backOffMutex := sync.Mutex{}
 go func() {
            defer wg.Done()

            for s := range jobC {
                <-time.After(time.Until(backOffUntil))
                limiter.Wait(ctx)
                res, err := doSomeRequest(s)
                if err != nil {
                    // Handle error.
                    log.Println(err)
                }

                // Handle rate limit.
                if res.StatusCode == 429 {
                    delay, _ := strconv.Atoi(res.Header.Get("Retry-After"))
                    log.Println("rate limit hit, backing off")

                    // Back off.
                    newbackOffUntil := time.Now().Add(time.Second * delay)
                    backOffMutex.Lock()
                    if newbackOffUntil.Unix() > backOffUntil.Unix() {
                        backOffUntil = newbackOffUntil
                    }
                    backOffMutex.Unlock()

                    // Put serial back into job queue.
                    jobC <- s
                }

                resultC <- res
            }
        }()