As part of a new feature my team is adding, I was asked to retry a specific HTTP POST request when it fails (when the internet connection is slow or unavailable): retry every X seconds, for a total period of Y seconds. For example, every 1 second over a period of 8 seconds. This is the code I came up with:
return this.carService.saveCar(car)
  .pipe(this.isRetryable() ? retryWhen(errors => {
      return errors.pipe(mergeMap(response => {
        if (response.status === TIMEOUT_EXCEPTION_CODE || response.status === UNKNOWN_EXCEPTION_CODE) {
          return of(response).pipe(delay(this.saveCarRetryTimeout)); // X seconds
        }
        return throwError(response);
      }));
    }) : tap(),
    this.isRetryable() ? timeout(this.saveCarProccessTime) : tap(), // Y seconds
    tap((carId: number) => {
      this.logger.info(saveCarBaseLog.setStatus(LogStatus.COMPLETE));
    }),
    catchError((err) => {
      this.logger.error(saveCarBaseLog.setStatus(LogStatus.FAILED));
      return throwError(err);
    }));
The isRetryable() function just checks if we have both X and Y configurations set, so it won't affect the process.
After seeing that it worked well both locally and in the development environment, we released the version. The next day we encountered a problem: in the preprod and prod environments, some cars were saved twice. After investigating, it looks like the problem comes from our service worker: whenever the full timeout elapses, the request itself times out, but the FETCH request associated with it is never cancelled. This causes a problem when the internet is just slow (the FETCH request eventually succeeds, and we get no indication of it). I'm really lost on what to do here, so any help is welcome!
I can't upload a screenshot of the network tab since it's a private network, but in the Network section in Chrome it looks like this:
POST request - saveCar - XHR - 504 Timeout
POST request - (ServiceWorker) saveCar - FETCH - 504 Timeout
POST request - saveCar - XHR - 504 Timeout
POST request - (ServiceWorker) saveCar - FETCH - 504 Timeout
POST request - saveCar - XHR - 504 Timeout
POST request - (ServiceWorker) saveCar - FETCH - 200 Success (The problematic one)
After simplifying a bit, here is how I understand the logic you've implemented so far:
import { MonoTypeOperatorFunction } from 'rxjs';
import { delay, filter, retryWhen, tap, timeout } from 'rxjs/operators';

class arbitraryClass {
  /* ... arbitrary code ... */
  arbitraryMethod() {
    return this.carService.saveCar(car).pipe(
      this.retryTimeoutLogic(),
      tap({
        complete: () => this.logger.info(saveCarBaseLog.setStatus(LogStatus.COMPLETE)),
        error: err => this.logger.error(saveCarBaseLog.setStatus(LogStatus.FAILED))
      })
    );
  }
  retryTimeoutLogic<T>(): MonoTypeOperatorFunction<T> {
    return s => !this.isRetryable() ? s : s.pipe(
      retryWhen(errors => errors.pipe(
        filter(response => {
          // Rethrow anything that is neither a timeout nor an unknown error
          if (response.status !== TIMEOUT_EXCEPTION_CODE &&
              response.status !== UNKNOWN_EXCEPTION_CODE
          ) {
            throw response;
          }
          return true;
        }),
        delay(this.saveCarRetryTimeout) // retry after X seconds
      )),
      timeout(this.saveCarProccessTime) // error after Y seconds
    );
  }
}
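The status test inside the filter is the part most likely to go wrong, so it can help to pull it into a small pure function and check it in isolation. The sketch below is only an illustration: the numeric values given to TIMEOUT_EXCEPTION_CODE and UNKNOWN_EXCEPTION_CODE are assumptions (the question never states them), and isRetryableStatus and shouldRethrow are hypothetical helper names. Note that the rethrow condition has to join the two inequality tests with &&. Joined with ||, it would be true for every status, since no status can equal both codes at once.

```typescript
// Assumed values: the question does not say what these constants are.
const TIMEOUT_EXCEPTION_CODE = 504;
const UNKNOWN_EXCEPTION_CODE = 0;

// Hypothetical helper: true only for the two statuses we want to retry on.
function isRetryableStatus(status: number): boolean {
  return status === TIMEOUT_EXCEPTION_CODE || status === UNKNOWN_EXCEPTION_CODE;
}

// The rethrow condition is the negation of the above (De Morgan's law):
// "not A and not B", never "not A or not B".
function shouldRethrow(status: number): boolean {
  return status !== TIMEOUT_EXCEPTION_CODE && status !== UNKNOWN_EXCEPTION_CODE;
}
```

Inside the filter this would read: if (shouldRethrow(response.status)) { throw response; }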
The issue is what happens when timeout(this.saveCarProccessTime) throws an error. It calls unsubscribe on the source, then emits an error downstream.
This means that this.carService.saveCar(car) needs an unsubscribe method that can cancel mid-flight or even recently completed requests (due to how async stuff might order itself).
You need to look there.
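As a rough illustration of what "cancellable" means here, the sketch below ties a request to the standard AbortController. The names Send, Cancellable and startCancellable are hypothetical, and the example is written without RxJS to stay self-contained; in an Observable, the cancel function would be what the subscribe callback returns as its teardown. Aborting only stops the client from waiting: a request the server has already received may still be processed.

```typescript
// Hypothetical shape of a request function that honours an AbortSignal,
// e.g. (signal) => fetch(url, { method: 'POST', body, signal }).
type Send<T> = (signal: AbortSignal) => Promise<T>;

interface Cancellable<T> {
  result: Promise<T>;  // settles with the response, or rejects on abort
  cancel: () => void;  // what an unsubscribe/teardown would call
}

// Starts the request immediately and hands back a way to abort it.
function startCancellable<T>(send: Send<T>): Cancellable<T> {
  const controller = new AbortController();
  return {
    result: send(controller.signal),
    cancel: () => controller.abort(),
  };
}
```

Whether saveCar can be wired up like this depends on how the service worker forwards the request, which the question does not show.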
... or
Never cancel in-flight requests. The server holds the single source of truth; always assume saveCar may still succeed if you've not been told upstream (by the server) that it has failed.
Just stop retrying after Y seconds. You'll get another 504 Timeout and you can throw that to the consumer directly to handle however they decide (presumably the same way they would if this.isRetryable() returned false).
// Also needs: import { defer } from 'rxjs';
retryTimeoutLogic<T>(): MonoTypeOperatorFunction<T> {
  return s => !this.isRetryable() ? s : defer(() => {
    // The time at which we stop retrying on TIMEOUT_EXCEPTION_CODE &
    // UNKNOWN_EXCEPTION_CODE. After this time we rethrow instead.
    // Date.now() is the epoch timestamp in ms; getMilliseconds() would only
    // give the 0-999 ms component of the current second.
    const timeoutDate = Date.now() + this.saveCarProccessTime;
    return s.pipe(
      retryWhen(errors => errors.pipe(
        filter(response => {
          const retryableStatus =
            response.status === TIMEOUT_EXCEPTION_CODE ||
            response.status === UNKNOWN_EXCEPTION_CODE;
          if (!retryableStatus || Date.now() > timeoutDate) {
            throw response;
          }
          return true;
        }),
        delay(this.saveCarRetryTimeout) // retry after X seconds
      ))
    );
  });
}
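The deadline check is easy to test on its own once it is separated from the operator chain. Below is a minimal sketch, assuming a hypothetical helper createDeadline with an injectable clock so the behaviour can be checked without waiting. It reads the clock with Date.now(), which returns milliseconds since the epoch. Date.prototype.getMilliseconds() returns only the 0 to 999 millisecond component of the current second, so it cannot measure a span of several seconds.

```typescript
// Hypothetical helper: returns a function that reports whether the retry
// budget (Y, in milliseconds) has run out. The clock is injectable so the
// logic can be tested with a fake time source.
function createDeadline(
  budgetMs: number,
  now: () => number = Date.now
): () => boolean {
  const deadline = now() + budgetMs; // absolute epoch time in ms
  return () => now() > deadline;
}
```

In the filter above, const expired = createDeadline(this.saveCarProccessTime) would be created inside defer, so each subscription gets its own deadline, and the time part of the rethrow condition would become expired().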