We have two AOS servers. On one, the AOS service stops in less than a minute; on the other, it consistently takes 25-30 minutes to stop.
I don't think this is expected (reset my expectations if needed). I'd like to find out what is going on during this time. Is there a log file somewhere that details what the AOS is doing at any given moment? I checked the event log and found nothing there.
When the AOS stops, it tries to do it gracefully so that transactions are committed or aborted (and rolled back).
When the AOS is in a stopping state, it is likely not starting newly created batches (leaving them in a waiting state) and is waiting for executing (in-progress) ones to complete (Ended status). I suspect it also has some sort of timeout after which it aborts or kills whatever is still running and continues shutting down.
If you want to test this, swap which AOS acts as the batch server and see whether the slow shutdown follows it. You can also check which batches are running when you go to stop the AOS and see whether any of them are long-running.
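To compare the two servers before and after swapping the batch role, it may help to time the shutdown rather than eyeball it. Below is a rough Python sketch, not anything AX-specific: it polls the standard Windows `sc query` command and reports how long the service takes to reach STOPPED. The helper names (`parse_state`, `query_service`, `wait_until_stopped`) are made up for this example, and the service name is something you would have to look up on your own server.

```python
import re
import subprocess
import time

# Matches the STATE line of `sc query` output, e.g. "STATE : 3  STOP_PENDING".
STATE_RE = re.compile(r"STATE\s*:\s*(\d+)\s+(\w+)")


def parse_state(sc_output):
    """Return the state name (e.g. 'STOP_PENDING') found in `sc query` output."""
    match = STATE_RE.search(sc_output)
    if match is None:
        raise ValueError("no STATE line found in sc output")
    return match.group(2)


def query_service(name):
    """Run `sc query <name>` (Windows only) and return its raw text output."""
    result = subprocess.run(
        ["sc", "query", name], capture_output=True, text=True, check=False
    )
    return result.stdout


def wait_until_stopped(name, query=query_service, poll_seconds=5.0,
                       timeout_seconds=3600.0):
    """Poll the service until it reports STOPPED; return elapsed seconds.

    `query` is injectable so the loop can be exercised without Windows.
    """
    start = time.monotonic()
    while True:
        state = parse_state(query(name))
        elapsed = time.monotonic() - start
        if state == "STOPPED":
            return elapsed
        if elapsed > timeout_seconds:
            raise TimeoutError(
                "%s still in state %s after %.0f s" % (name, state, elapsed)
            )
        time.sleep(poll_seconds)
```

Issue the stop from the Services console (or `sc stop`), then run something like `print(wait_until_stopped("AOS60$01"))` on each server. `AOS60$01` is only a placeholder here; substitute whatever service name your AOS instance actually uses. Since timing starts when the script starts, launch it right after requesting the stop.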