Search code examples
project-reactorspring-webclient

Test using StepVerifier blocks when using Spring WebClient with retry


EDIT: here https://github.com/wujek-srujek/reactor-retry-test is a repository with all the code.

I have the following Spring WebClient code to POST to a remote server (Kotlin code without imports for brevity):

private val logger = KotlinLogging.logger {}

@Component
class Client(private val webClient: WebClient) {

    companion object {
        const val maxRetries = 2L
        val firstBackOff = Duration.ofSeconds(5L)
        val maxBackOff = Duration.ofSeconds(20L)
    }

    fun send(uri: URI, data: Data): Mono<Void> {
        return webClient
            .post()
            .uri(uri)
            .contentType(MediaType.APPLICATION_JSON)
            .bodyValue(data)
            .retrieve()
            .toBodilessEntity()
            .doOnSubscribe {
                logger.info { "Calling backend, uri: $uri" }
            }
            .retryExponentialBackoff(maxRetries, firstBackOff, maxBackOff, jitter = false) {
                logger.debug { "Call to $uri failed, will retry (#${it.iteration()} of max $maxRetries)" }
            }
            .doOnError {
                logger.error { "Call to $uri with $maxRetries retries failed with $it" }
            }
            .doOnSuccess {
                logger.info { "Call to $uri succeeded" }
            }
            .then()
    }
}

(It returns an empty Mono as we don't expect an answer, nor do we care about it.)

I would like to test 2 cases, and one of them is giving me headaches, namely the one in which I want to test that all the retries have been fired. We are using MockWebServer (https://github.com/square/okhttp/tree/master/mockwebserver) and the StepVerifier from reactor-test. (The test for success is easy and doesn't need any virtual time scheduler magic, and works just fine.) Here is the code for the failing one:

@JsonTest
@ContextConfiguration(classes = [Client::class, ClientConfiguration::class])
class ClientITest @Autowired constructor(
    private val client: Client
) {
    lateinit var server: MockWebServer

    @BeforeEach
    fun `init mock server`() {
        server = MockWebServer()
        server.start()
    }

    @AfterEach
    fun `shutdown server`() {
        server.shutdown()
    }

   @Test
   fun `server call is retried and eventually fails`() {
       val data = Data()
       val uri = server.url("/server").uri()
       val responseStatus = HttpStatus.INTERNAL_SERVER_ERROR

       repeat((0..Client.maxRetries).count()) {
           server.enqueue(MockResponse().setResponseCode(responseStatus.value()))
       }

       StepVerifier.withVirtualTime { client.send(uri, data) }
           .expectSubscription()
           .thenAwait(Duration.ofSeconds(10)) // wait for the first retry
           .expectNextCount(0)
           .thenAwait(Duration.ofSeconds(20)) // wait for the second retry
           .expectNextCount(0)
           .expectErrorMatches {
               val cause = it.cause
               it is RetryExhaustedException &&
                       cause is WebClientResponseException &&
                       cause.statusCode == responseStatus
           }
           .verify()

       // assertions
       }
   }

I am using withVirtualTime because I don't want the test to take nearly seconds. The problem is that the test blocks indefinitely. Here is the (simplified) log output:

okhttp3.mockwebserver.MockWebServer      : MockWebServer[51058] starting to accept connections
Calling backend, uri: http://localhost:51058/server
MockWebServer[51058] received request: POST /server HTTP/1.1 and responded: HTTP/1.1 500 Server Error
Call to http://localhost:51058/server failed, will retry (#1 of max 2)
Calling backend, uri: http://localhost:51058/server
MockWebServer[51058] received request: POST /server HTTP/1.1 and responded: HTTP/1.1 500 Server Error
Call to http://localhost:51058/server failed, will retry (#2 of max 2)

As you can see, the first retry works, but the second one blocks. I don't know how to write the test so that it doesn't happen. To make matters worse, the client will actually use jitter, which will make the timing hard to anticipate.

The following test using StepVerifier but without WebClient works fine, even with more retries:

@Test
fun test() {
    StepVerifier.withVirtualTime {
        Mono
            .error<RuntimeException>(RuntimeException())
            .retryExponentialBackoff(5,
                                     Duration.ofSeconds(5),
                                     Duration.ofMinutes(2),
                                     jitter = true) {
                println("Retrying")
            }
            .then()
    }
        .expectSubscription()
        .thenAwait(Duration.ofDays(1)) // doesn't matter
        .expectNextCount(0)
        .expectError()
        .verify()
}

Could anybody help me fix the test, and ideally, explain what is wrong?


Solution

  • This is a limitation of virtual time and the way the clock is manipulated in StepVerifier. The thenAwait methods are not synchronized with the underlying scheduling (that happens for example as part of the retryBackoff operation). This means that the operator submits retry tasks at a point where the clock has already been advanced by one day. So the second retry is scheduled for + 1 day and 10 seconds, since the clock is at +1 day. After that, the clock is never advanced so the additional request is never made to MockWebServer.

    Your case is made even more complicated in the sense that there is an additional component involved, the MockWebServer, that still works "in real time". Though advancing the virtual clock is a very quick operation, the response from the MockWebServer still goes through a socket and thus has some amount of latency to the retry scheduling, which makes things more complicated from the test writing perspective.

    One possible solution to explore would be to externalize the creation of the VirtualTimeScheduler and tie advanceTimeBy calls to the mockServer.takeRequest(), in a parallel thread.