Right now I have my puma setup successfully running, with zero downtime deployment via phased restarts (served behind nginx).
It works pretty nicely: no connections are lost and new versions become available seamlessly.
But right now we set the process count to the number of CPUs minus 1, so a big server with 32 CPUs runs 31 puma processes, which get restarted one by one during a phased restart. This takes a really long time for us: around 15 minutes, since each process takes about 30 seconds to boot (yeah, many gems, big system).
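As a rough sanity check on that number, here is a minimal sketch of the arithmetic. It assumes workers are restarted strictly one at a time and that boot time dominates (shutdown time is ignored). The function name `phased_restart_seconds` is made up for illustration and is not part of puma.

```ruby
# Rough estimate of how long a phased restart takes when workers are
# replaced one after another. Assumption: shutdown time is negligible.
def phased_restart_seconds(cpu_count, boot_seconds)
  workers = cpu_count - 1   # process count = CPUs - 1, as described above
  workers * boot_seconds    # sequential restarts add up linearly
end

seconds = phased_restart_seconds(32, 30)
puts "#{seconds} s (#{seconds / 60.0} min)"  # prints "930 s (15.5 min)"
```

So 31 workers at about 30 seconds each is consistent with the roughly 15 minutes we observe.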
I saw that clustered mode can also be used for fast deployments by setting preload_app!, but I can't understand what happens during the deployment:
Will current requests get dropped? Is there a small time frame in which no new connections are accepted? I tried to figure it out from the READMEs, but it wasn't clear to me what exactly happens.
Would be great to hear an explanation, thank you!
OK, I tested it myself: new connections get stuck and have to wait until the workers have finished loading. Existing connections are not closed. So for zero-downtime deployments you really do need phased restarts, with the disadvantage that they take quite long if you have many workers.
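For reference, a minimal sketch of a puma config that stays compatible with phased restarts. The thread counts are placeholder values I picked, and the explicit `preload_app! false` is an assumption on my part to cover puma versions that may preload by default in cluster mode, so check it against the version you run.

```ruby
# config/puma.rb: sketch only, thread counts are placeholders
require "etc"

# One worker fewer than the number of CPUs, as in the question.
workers [Etc.nprocessors - 1, 1].max
threads 1, 5   # assumed values, tune for your app

# Phased restarts need each worker to load the app itself,
# so preloading must stay off.
preload_app! false

# Lets workers pick up a changed Gemfile during a phased restart.
prune_bundler
```

With a config like this, a phased restart can be triggered with `pumactl phased-restart` or by sending SIGUSR1 to the puma master process.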