Tags: cloud, cloud-foundry, autoscaling

Ensure graceful scale down with PCF Autoscaler


I have a Spring Boot application on PCF that handles long-running batch jobs, and I am planning to use PCF Autoscaler for automatic scaling.

How can I ensure a graceful scale-down if the app takes approximately 15-20 minutes to drain jobs that are already running? Is there a mechanism to control the scale-down time so that an app instance gets enough time before it is shut down?


Solution

  • The amount of time an application has to shut down gracefully is set by the platform, not by PCF Autoscaler or your application. On PCF, it defaults to 10 seconds.

    By default, apps must finish their in-flight jobs within ten seconds of receiving the SIGTERM before TAS for VMs terminates the app with a SIGKILL.

    https://docs.pivotal.io/application-service/2-12/devguide/deploy-apps/app-lifecycle.html#shutdown

    As noted at that link, you can change that value, but the change applies platform-wide.

    To modify the timeout period on the TAS for VMs tile or IST tile, go to the Advanced Settings tab and edit the “app graceful shutdown period” property.

    There are downsides to changing it, such as longer evacuation times when updating Diego Cells. You are essentially giving every application that much time to keep running before the platform can stop it, which ties up platform resources for longer.

    I would not recommend increasing the value platform-wide to accommodate your need of 15-20 minutes.

    My suggestion would be to adjust your batch jobs so that they can be paused or stopped and resumed later. When the SIGTERM is sent, you can then tell the jobs to pause or stop, which can hopefully be done in less than 10 seconds. When your application restarts, the paused or stopped job can be resumed.
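
The pause-and-resume idea could be sketched roughly as follows. This is only an illustration in plain Java, not a Spring Batch or PCF API: the class `ResumableBatchJob`, its `requestStop`, `run`, and `processItem` methods, and the in-memory checkpoint counter are all hypothetical names. It assumes the job can be split into items and that progress can be saved after each item. A real application would keep the checkpoint in durable storage (for example a database), because an in-memory counter does not survive an instance restart.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a batch job that stops at an item boundary
// and later resumes from the last saved checkpoint.
class ResumableBatchJob {
    private final int totalItems;
    // Stand-in for a durable checkpoint store (assumption): holds the
    // index of the next item that still needs processing.
    private final AtomicInteger checkpoint;
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    ResumableBatchJob(int totalItems, AtomicInteger checkpoint) {
        this.totalItems = totalItems;
        this.checkpoint = checkpoint;
    }

    // Called when the platform signals shutdown (SIGTERM).
    void requestStop() {
        stopRequested.set(true);
    }

    // Processes items starting at the checkpoint.
    // Returns true if the job finished, false if it stopped early.
    boolean run() {
        for (int i = checkpoint.get(); i < totalItems; i++) {
            if (stopRequested.get()) {
                return false; // stop between items; progress is already saved
            }
            processItem(i);
            checkpoint.set(i + 1); // record progress after each item
        }
        return true;
    }

    // Placeholder for the real per-item work; each item should take
    // well under the shutdown window so the stop check runs often.
    protected void processItem(int index) {
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger store = new AtomicInteger(0);
        ResumableBatchJob job = new ResumableBatchJob(1000, store);
        Thread worker = new Thread(job::run);

        // The JVM runs shutdown hooks when it receives SIGTERM.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            job.requestStop();
            try {
                // Wait less than the 10-second default grace period.
                worker.join(8000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));

        worker.start();
        worker.join();
    }
}
```

In a Spring Boot application, the call to `requestStop()` would more likely live in a `@PreDestroy` method or a similar lifecycle callback than in a hand-written shutdown hook, since Spring Boot closes the application context when the JVM receives SIGTERM. The design choice that matters is checking the stop flag between small units of work, so the job can exit well inside the grace period whatever its total runtime.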