I have a Spring Cloud Gateway application, which uses Consul to define routes based on the services registered in Consul.
I noticed that Spring configures the RouteLocator to be a CachingRouteLocator by default, but this cache is never invalidated or refreshed.
As a result, the cached route locator contains stale routes if a service gets registered or deregistered in Consul.
Is there a way to make it react to Consul changes, or to override and replace the implementation so that it is refreshed periodically?
I found something related to it saying the following:
The CachingRouteLocator caches routes and refreshes them based on a certain interval defined in the property spring.cloud.gateway.route-cache-timeout-millis, with a default value of 5000 ms (5 seconds).
Unfortunately, this doesn't seem to have any effect; the routes only start refreshing if I configure a HeartbeatEvent listener with a call like this: ((CachingRouteLocator) routeLocator).refresh().
Obviously this is not ideal: if the implementation changes, this call would no longer be made. Moreover, it fires far too often and may affect performance.
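One way to limit how often the heartbeat-driven refresh fires is to throttle it. The following is only a minimal sketch using plain JDK classes; RefreshThrottle, tryRun and the injected clock are hypothetical names that are not part of Spring Cloud Gateway, and the action passed in is assumed to be whatever refresh call is being used.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper (not a Spring class): runs an action at most once per interval. */
class RefreshThrottle {

    private static final long NEVER = Long.MIN_VALUE;

    private final long minIntervalMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis
    private final Runnable action;      // assumed to trigger the route refresh
    private final AtomicLong lastRun = new AtomicLong(NEVER);

    RefreshThrottle(long minIntervalMillis, LongSupplier clock, Runnable action) {
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
        this.action = action;
    }

    /** Runs the action if the interval has elapsed; returns true if it ran. */
    boolean tryRun() {
        long now = clock.getAsLong();
        long previous = lastRun.get();
        boolean due = previous == NEVER || now - previous >= minIntervalMillis;
        // compareAndSet ensures only one of several concurrent callers wins.
        if (due && lastRun.compareAndSet(previous, now)) {
            action.run();
            return true;
        }
        return false;
    }
}
```

In the HeartbeatEvent listener, the direct refresh call would then be replaced by something like throttle.tryRun(), with the throttle created as new RefreshThrottle(30_000, System::currentTimeMillis, refreshAction). This only reduces the frequency; it does not remove the dependency on the concrete CachingRouteLocator type.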
I wonder what's the right way to approach it. I'm pretty sure the problem has to be solved for any solution that utilizes Consul or another route definition provider that can get updated.
UPDATE: There's this method in CachingRouteLocator that is probably supposed to do the job:
public void onApplicationEvent(RefreshRoutesEvent event) {
    try {
        fetch().collect(Collectors.toList()).subscribe(
                list -> Flux.fromIterable(list).materialize().collect(Collectors.toList()).subscribe(signals -> {
                    applicationEventPublisher.publishEvent(new RefreshRoutesResultEvent(this));
                    cache.put(CACHE_KEY, signals);
                }, this::handleRefreshError), this::handleRefreshError);
    }
    catch (Throwable e) {
        handleRefreshError(e);
    }
}
For some reason, the implementation of this method does not clear the cache and instead does something more elaborate. As a result, it does not work properly even though it gets called upon service registration. Calling fetch() does not do the job, whereas clearing the cache completely does seem to work. I wonder if there's a bug in that listener or whether something else needs to be done for the listener to function properly.
I encountered exactly the same problem in my application. This GitHub issue also describes the same problem together with a workaround, which unfortunately does not seem to work for spring-cloud-gateway-server 4.0.9, at least for me.
At this point in time I don't really see a "clean" solution to this problem, as it seems to be a bug in Spring Cloud Gateway. My admittedly not very clean workaround is to regularly call refresh() on the AbstractGatewayControllerEndpoint in a scheduled task (which does the same as triggering the "Refreshing the Route Cache" Actuator API):
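import java.util.Collections;
import org.springframework.cloud.gateway.actuate.AbstractGatewayControllerEndpoint;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class RefreshGatewayCacheTask {

    private final AbstractGatewayControllerEndpoint gatewayControllerEndpoint;

    public RefreshGatewayCacheTask(AbstractGatewayControllerEndpoint gatewayControllerEndpoint) {
        this.gatewayControllerEndpoint = gatewayControllerEndpoint;
    }

    // Requires scheduling to be enabled (e.g. @EnableScheduling on a configuration class).
    @Scheduled(fixedDelay = 30000) // can be any value you would like
    public void refreshGatewayCache() {
        // log something like "Scheduled Task: Refreshing the gateway cache"
        this.gatewayControllerEndpoint.refresh(Collections.emptyList()).subscribe();
    }
}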
For this to work, you have to set the following property: management.endpoint.gateway.enabled=true so that the AbstractGatewayControllerEndpoint gets created and can be used.
This regularly refreshes the gateway cache, which may have become stale in the meantime. That fixes the problem: after the refresh, the gateway successfully connects to services newly deployed in Consul.
In initial tests this seems to work, but I am not sure about potential performance issues or other side effects.
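If the cost of refreshing on every tick turns out to matter, one option might be to refresh only when the set of registered service IDs has actually changed. The sketch below uses plain JDK classes only; ServiceChangeDetector and check are hypothetical names, the supplier is assumed to return the current service IDs (for example from a discovery client), and the onChange action is assumed to be the refresh call from the scheduled task above. It also assumes that routes depend only on the set of service IDs, not on individual instances.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical helper (not a Spring class): triggers an action only when the service set changes. */
class ServiceChangeDetector {

    private final Supplier<Collection<String>> serviceIds; // assumed source of current service IDs
    private final Runnable onChange;                        // assumed to trigger the route refresh
    private Set<String> lastSeen;                           // null until the first check

    ServiceChangeDetector(Supplier<Collection<String>> serviceIds, Runnable onChange) {
        this.serviceIds = serviceIds;
        this.onChange = onChange;
    }

    /** Compares the current service IDs with the last snapshot; returns true if a refresh ran. */
    synchronized boolean check() {
        Set<String> current = new HashSet<>(serviceIds.get());
        if (current.equals(lastSeen)) {
            return false; // nothing registered or deregistered since the last check
        }
        lastSeen = current;
        onChange.run();
        return true;
    }
}
```

The scheduled method would then call detector.check() instead of refreshing unconditionally. The first call always triggers a refresh because there is no previous snapshot to compare against.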