mesos, marathon, mesosphere, dcos

How to scale down instances based on their uptime with Apache Marathon?


I need to scale down container instances based on how long they have been running. When scaling down through Marathon's API, the newest instances appear to be removed first. Is there a configuration option I'm not aware of that controls which instances are removed when scaling down on Apache Marathon?

Right now I'm using marathon-lb-autoscale to automatically adjust the number of running instances. Under the hood, marathon-lb-autoscale performs a PUT request that updates the instances property of the application whenever req/s increases or decreases.

# Excerpt: scale_list maps each app ID to its target instance count.
scale_list.each do |app, instances|
  req = Net::HTTP::Put.new('/v2/apps/' + app)
  # Use basic auth only when credentials were supplied.
  unless @options.marathonCredentials.empty?
    req.basic_auth(@options.marathonCredentials[0], @options.marathonCredentials[1])
  end
  req.content_type = 'application/json'
  # Only the instances property is sent in the update.
  req.body = JSON.generate({ 'instances' => instances })
  Net::HTTP.new(@options.marathon.host, @options.marathon.port).start do |http|
    http.request(req)
  end
end

I don't know whether the upgradeStrategy configuration is taken into account when scaling down instances. With the default settings I cannot get the expected behaviour.

{
  "upgradeStrategy": {
    "minimumHealthCapacity": 1,
    "maximumOverCapacity": 1
  }
}

ACTUAL

  • instance 1
  • instance 2
  • PUT /v2/apps/my-app {instances: 3}
  • instance 1
  • instance 2
  • instance 3
  • PUT /v2/apps/my-app {instances: 2}
  • instance 1
  • instance 2

EXPECTED

  • instance 1
  • instance 2
  • PUT /v2/apps/my-app {instances: 3}
  • instance 1
  • instance 2
  • instance 3
  • PUT /v2/apps/my-app {instances: 2}
  • instance 2
  • instance 3
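The two sequences differ only in which end of the age ordering is removed when the instance count drops. The following is a minimal sketch of that difference in plain Ruby. It is not Marathon code: the Instance struct, the survivors method, the :youngest_first and :oldest_first labels, and the start times are all hypothetical names and values chosen for illustration.

```ruby
# Hypothetical model of an instance: a name plus the time it was started.
Instance = Struct.new(:name, :started_at)

# Returns the instances that remain after scaling down to `target`.
# `kill_selection` is an illustrative label, not a Marathon API value.
def survivors(instances, target, kill_selection)
  oldest_first = instances.sort_by(&:started_at)
  excess = [instances.size - target, 0].max
  case kill_selection
  when :youngest_first
    # Drop the most recently started instances.
    oldest_first.first(oldest_first.size - excess)
  when :oldest_first
    # Drop the instances that have been running longest.
    oldest_first.drop(excess)
  else
    raise ArgumentError, "unknown kill selection: #{kill_selection}"
  end
end

# Made-up start times; a larger number means a later start.
running = [
  Instance.new('instance 1', 100),
  Instance.new('instance 2', 200),
  Instance.new('instance 3', 300)
]

puts survivors(running, 2, :youngest_first).map(&:name).inspect
puts survivors(running, 2, :oldest_first).map(&:name).inspect
```

With these made-up start times, the first call keeps instances 1 and 2 (the ACTUAL sequence) and the second keeps instances 2 and 3 (the EXPECTED sequence).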

Solution

  • Set killSelection directly in the application's configuration. YoungestFirst kills the youngest tasks first, and OldestFirst kills the oldest ones first. YoungestFirst is the documented default, which matches the ACTUAL sequence in the question.

    Reference: https://mesosphere.github.io/marathon/docs/configure-task-handling.html
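As a rough sketch, the setting could be applied with the same kind of PUT request the autoscaler uses for instances. This assumes Marathon accepts a partial update containing only killSelection, in the same way the instances-only PUT in the question relies on. The kill_selection_request helper, the 'my-app' ID, and the host and port in the final comment are placeholders, not part of Marathon or marathon-lb-autoscale.

```ruby
require 'net/http'
require 'json'

# Builds (but does not send) a PUT request that sets killSelection on an app.
# `app_id` is a placeholder application ID such as 'my-app'.
def kill_selection_request(app_id, kill_selection = 'OldestFirst')
  req = Net::HTTP::Put.new('/v2/apps/' + app_id)
  req.content_type = 'application/json'
  req.body = JSON.generate('killSelection' => kill_selection)
  req
end

req = kill_selection_request('my-app')
puts req.path
puts req.body

# Sending it would look like this; host and port are assumptions:
# Net::HTTP.new('marathon.example.com', 8080).start { |http| http.request(req) }
```

The same property can instead be placed in the application's JSON definition next to upgradeStrategy, so it is applied whenever the app is deployed.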