
How to tune Netflix Eureka self-preservation to handle autoscaling?


The self-preservation feature, which never expires, does not look friendly to cluster autoscaling. When we scale down our services after the load drops, the instances that were shut down can trigger self-preservation.

As I understand it, self-preservation is meant to tolerate short-term network issues. But there are already settings that let us tune a tolerance window:

eureka.instance.lease-expiration-duration-in-seconds = 90
eureka.instance.lease-renewal-interval-in-seconds = 30

I have come across advice not to turn self-preservation off, but it seems to bring more pain than gain. Am I missing something?


Solution

  • First, you need to distinguish between a normal shutdown and an unclean termination of a Eureka client. Self-preservation mode only cares about unclean terminations.

    Namely, when you scale down your servers, self-preservation mode will not be activated as long as your applications shut down normally (that is, they unregister).

    If you're using a Spring Cloud based Eureka client, this unregistration happens when the application shuts down. The problem is that some Spring Cloud releases have issues with sending the shutdown (Eureka unregister) message. So if you want to be sure, send unregister requests to the Eureka server via its REST API for the removed instances right after scaling down.

    Another possible approach is simply to lower the threshold for self-preservation.

    eureka:
      server:
        renewal-percent-threshold: 0.50
    

    One more thing: be careful when changing the eureka.instance.leaseRenewalIntervalInSeconds value. The original Eureka server source code assumes this value is 30 seconds when it calculates the threshold for self-preservation mode. I'm not sure whether this hard-coded assumption still exists in the latest Spring Cloud release, so you need to double-check.
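
The explicit unregister step mentioned above could look roughly like the following. This is a minimal sketch, not a tested integration: it assumes the Eureka server exposes its REST API under a base URL such as http://localhost:8761/eureka (a common Spring Cloud setup, but yours may differ) and that an instance is removed with DELETE on apps/{appName}/{instanceId}. The function names and the example values are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def unregister_url(eureka_base: str, app_name: str, instance_id: str) -> str:
    """Build the URL used to remove one instance from the registry.

    eureka_base is an assumption, e.g. "http://localhost:8761/eureka".
    """
    base = eureka_base.rstrip("/")
    # Instance IDs often look like "host:app:port", so keep ':' readable.
    return f"{base}/apps/{quote(app_name, safe='')}/{quote(instance_id, safe=':')}"


def unregister(eureka_base: str, app_name: str, instance_id: str,
               timeout: float = 5.0) -> int:
    """Send the DELETE request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, for example
    when the instance is already gone from the registry.
    """
    request = urllib.request.Request(
        unregister_url(eureka_base, app_name, instance_id), method="DELETE"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A scale-down hook would then call something like unregister("http://localhost:8761/eureka", "ORDERS", "host1:orders:8080") once for each instance it has terminated.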
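
To see why both the threshold and the renewal interval matter, here is a simplified model of the self-preservation check. It is a sketch under stated assumptions, not the server's actual code: it assumes the server expects one renewal every 30 seconds per registered instance (the hard-coded assumption mentioned above), multiplies that by renewal-percent-threshold, truncates to an integer, and enters self-preservation when the renewals actually received per minute fall below the result. The function names are hypothetical, and 0.85 is used as the default threshold.

```python
def renewal_threshold(instances: int,
                      renewal_percent_threshold: float = 0.85,
                      assumed_interval_seconds: int = 30) -> int:
    """Minimum renewals per minute the server wants to see (simplified)."""
    expected_per_minute = instances * (60.0 / assumed_interval_seconds)
    return int(expected_per_minute * renewal_percent_threshold)


def self_preservation_triggered(registered: int,
                                renewing: int,
                                renewal_percent_threshold: float = 0.85,
                                actual_interval_seconds: int = 30) -> bool:
    """True if the renewals received per minute fall below the threshold.

    registered: instances the server still has in its registry
    renewing:   instances that are actually still sending heartbeats
    """
    actual_per_minute = renewing * (60.0 / actual_interval_seconds)
    return actual_per_minute < renewal_threshold(registered,
                                                 renewal_percent_threshold)
```

In this model, with 10 registered instances of which 4 were killed without unregistering, the server receives 12 renewals per minute against a threshold of 17 at 0.85, so self-preservation kicks in; at 0.50 the threshold drops to 10 and it does not. If all 10 instances are healthy but renew every 60 seconds, the server receives only 10 renewals per minute and still falls below 17, which is why changing the renewal interval needs care.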