Tags: http, go, error-handling, timeout, retry-logic

How to customize http.Client or http.Transport in Go to retry after timeout?


I want to implement a custom http.Transport for the standard http.Client that retries automatically when a request times out.

P.S. For various reasons, the custom http.Transport is a must-have. I've already looked at hashicorp/go-retryablehttp, but it won't let me use my own http.Transport.

Here's my attempt. The custom component:

type CustomTransport struct {
    http.RoundTripper
    // ... private fields
}

func NewCustomTransport(upstream *http.Transport) *CustomTransport {
    upstream.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
    // ... other customizations for transport
    return &CustomTransport{upstream}
}

func (ct *CustomTransport) RoundTrip(req *http.Request) (resp *http.Response, err error) {
    req.Header.Set("Secret", "Blah blah blah")
    // ... other customizations for each request

    for i := 1; i <= 5; i++ {
        resp, err = ct.RoundTripper.RoundTrip(req)
        if errors.Is(err, context.DeadlineExceeded) {
            log.Warnf("#%d got timeout will retry - %v", i, err)
            //time.Sleep(time.Duration(100*i) * time.Millisecond)
            continue
        } else {
            break
        }
    }

    log.Debugf("got final result: %v", err)
    return resp, err
}

The caller code:

func main() {
    transport := NewCustomTransport(http.DefaultTransport.(*http.Transport))
    client := &http.Client{
        Timeout:   8 * time.Second,
        Transport: transport,
    }

    apiUrl := "https://httpbin.org/delay/10"

    log.Debugf("begin to get %q", apiUrl)
    start := time.Now()
    resp, err := client.Get(apiUrl)
    if err != nil {
        log.Warnf("client got error: %v", err)
    } else {
        defer resp.Body.Close()
    }
    log.Debugf("end to get %q, time cost: %v", apiUrl, time.Since(start))

    if resp != nil {
        data, err := httputil.DumpResponse(resp, true)
        if err != nil {
            log.Warnf("fail to dump resp: %v", err)
        }
        fmt.Println(string(data))
    }
}

My implementation didn't work as expected: once the client timeout fires, the retries don't actually happen. See the log below, where all five attempts fail within the same millisecond:

2020-07-15T00:53:22.586 DEBUG   begin to get "https://httpbin.org/delay/10"
2020-07-15T00:53:30.590 WARN    #1 got timeout will retry - context deadline exceeded
2020-07-15T00:53:30.590 WARN    #2 got timeout will retry - context deadline exceeded
2020-07-15T00:53:30.590 WARN    #3 got timeout will retry - context deadline exceeded
2020-07-15T00:53:30.590 WARN    #4 got timeout will retry - context deadline exceeded
2020-07-15T00:53:30.590 WARN    #5 got timeout will retry - context deadline exceeded
2020-07-15T00:53:30.590 DEBUG   got final result: context deadline exceeded
2020-07-15T00:53:30.590 WARN    client got error: Get "https://httpbin.org/delay/10": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2020-07-15T00:53:30.590 DEBUG   end to get "https://httpbin.org/delay/10", time cost: 8.004182786s

Can you please tell me how to fix this, or suggest other ways to implement such an http.Client?


Solution

  • Note that the Timeout field of http.Client is more or less obsolete. Best practice now is to use http.Request.Context() for timeouts. – Flimzy
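
The comment above points at per-request contexts. As a minimal sketch of what that looks like on the caller side, the following uses http.NewRequestWithContext to attach a deadline to a single request. The helper names getWithDeadline and timedOut are hypothetical, and a local httptest server stands in for the slow httpbin endpoint.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// getWithDeadline (hypothetical helper) issues a GET whose whole lifetime,
// from connecting to receiving the response headers, is bounded by timeout.
func getWithDeadline(client *http.Client, url string, timeout time.Duration) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // safe here: the body is closed before this function returns

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// timedOut (hypothetical helper) reports whether a request to a deliberately
// slow local test server fails with context.DeadlineExceeded.
func timedOut() bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Stall until the client gives up, with an upper bound as a safety net.
		select {
		case <-r.Context().Done():
		case <-time.After(500 * time.Millisecond):
		}
	}))
	defer srv.Close()

	_, err := getWithDeadline(http.DefaultClient, srv.URL, 50*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("timed out:", timedOut())
}
```

Note that a deadline set this way still covers the whole call, including any retries a custom RoundTripper performs internally, so it does not by itself give each attempt a fresh timeout.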

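    Thanks to @Flimzy for the inspiration! The retries in the original version all failed instantly because http.Client.Timeout sets one deadline for the whole request, and every attempt inside RoundTrip inherited that already-expired context. I switched to giving each attempt its own context timeout instead of relying on http.Client.Timeout. Here's the code: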

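    func (ct *CustomTransport) RoundTrip(req *http.Request) (resp *http.Response, err error) {
        req.Header.Set("Secret", "Blah blah blah")
        // ... other customizations for each request

        for i := 1; i <= 5; i++ {
            // Each attempt gets a fresh deadline that does not depend on Client.Timeout.
            ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
            resp, err = ct.RoundTripper.RoundTrip(req.WithContext(ctx))
            if err == nil {
                // Do not cancel here: the caller still has to read resp.Body,
                // and cancelling ctx would abort that read. The timer
                // releases ctx once the timeout is up.
                break
            }
            cancel() // the attempt failed, so release its context right away
            if !errors.Is(err, context.DeadlineExceeded) {
                break
            }
            log.Warnf("#%d got timeout will retry - %v", i, err)
            //time.Sleep(time.Duration(100*i) * time.Millisecond)
        }

        log.Debugf("got final result: %v", err)
        return resp, err
    }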
    

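    According to the log, it works. Note the timestamps: the attempts are about 8 seconds apart, so each retry really ran until its own deadline (this run apparently used an 8-second per-attempt timeout rather than the 10 seconds in the snippet):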

    2020-07-16T00:06:12.788+0800    DEBUG   begin to get "https://httpbin.org/delay/10"
    2020-07-16T00:06:20.794+0800    WARN    #1 got timeout will retry - context deadline exceeded
    2020-07-16T00:06:28.794+0800    WARN    #2 got timeout will retry - context deadline exceeded
    2020-07-16T00:06:36.799+0800    WARN    #3 got timeout will retry - context deadline exceeded
    2020-07-16T00:06:44.803+0800    WARN    #4 got timeout will retry - context deadline exceeded
    2020-07-16T00:06:52.809+0800    WARN    #5 got timeout will retry - context deadline exceeded
    2020-07-16T00:06:52.809+0800    DEBUG   got final result: context deadline exceeded
    2020-07-16T00:06:52.809+0800    WARN    client got error: Get "https://httpbin.org/delay/10": context deadline exceeded
    2020-07-16T00:06:52.809+0800    DEBUG   end to get "https://httpbin.org/delay/10", time cost: 40.019334668s
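
For readers who want something they can run locally, here is a self-contained sketch of the same per-attempt-timeout idea. It is not the code from the answer above: the names retryTransport, cancelOnClose, and fetchWithRetries are hypothetical, a local httptest server replaces httpbin, and the sketch assumes requests without a body (a request body would have to be rewound between attempts, for example via req.GetBody). It differs from the snippet in two ways: each attempt's context is derived from req.Context(), so caller cancellation and Client.Timeout still bound the overall call, and the context of a successful attempt is cancelled only when the response body is closed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// retryTransport (hypothetical) wraps any RoundTripper and gives every
// attempt its own deadline.
type retryTransport struct {
	next       http.RoundTripper
	perAttempt time.Duration
	maxTries   int
}

// cancelOnClose releases an attempt's context only after the caller has
// finished with the response body.
type cancelOnClose struct {
	io.ReadCloser
	cancel context.CancelFunc
}

func (b *cancelOnClose) Close() error {
	err := b.ReadCloser.Close()
	b.cancel()
	return err
}

func (t *retryTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	err := errors.New("retryTransport: maxTries must be at least 1")
	for i := 1; i <= t.maxTries; i++ {
		// Derive from req.Context() so the caller can still cancel everything.
		ctx, cancel := context.WithTimeout(req.Context(), t.perAttempt)
		var resp *http.Response
		// Clone so the caller's request is never mutated.
		resp, err = t.next.RoundTrip(req.Clone(ctx))
		if err == nil {
			resp.Body = &cancelOnClose{ReadCloser: resp.Body, cancel: cancel}
			return resp, nil
		}
		cancel()
		// Retry only if this attempt timed out while the caller's own
		// context is still alive.
		if !errors.Is(err, context.DeadlineExceeded) || req.Context().Err() != nil {
			break
		}
	}
	return nil, err
}

// fetchWithRetries starts a test server whose first `slow` responses stall,
// performs one GET through retryTransport, and returns how many requests
// the server saw together with the response body.
func fetchWithRetries(slow int32) (int32, string, error) {
	var hits int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&hits, 1) <= slow {
			select { // stall until the client gives up
			case <-r.Context().Done():
			case <-time.After(time.Second):
			}
			return
		}
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	client := &http.Client{Transport: &retryTransport{
		next:       http.DefaultTransport,
		perAttempt: 200 * time.Millisecond,
		maxTries:   5,
	}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return atomic.LoadInt32(&hits), "", err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	return atomic.LoadInt32(&hits), string(data), err
}

func main() {
	attempts, body, err := fetchWithRetries(2)
	fmt.Println(attempts, body, err)
}
```

Wrapping the body is what makes it safe to cancel: the transport ties the body read to the request context, so cancelling as soon as RoundTrip returns could cut off a response that the caller has not finished reading.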