Cleanest way to kill a Drake simulation from another process

In an example such as examples/allegro_hand, where a main process advances the simulator and another process sends commands to it over LCM, what's the cleanest way for each process to kill the other?

I'm struggling to kill the side process when the main process dies. I've wrapped the AdvanceTo call in a try block, and I catch the exception thrown with this message:

MultibodyPlant's discrete update solver failed to converge

I can manually publish a boolean with drake::lcm::Publish within the catch block. In the side process, I subscribe and use a HandleStatus callback to process incoming messages. HandleStatus isn't called unless I add a loop such as while(0 == lcm_.handleTimeout(10)). When I do, the side process gets stuck waiting for a message, which never arrives unless the simulation throws. Any advice on how to handle this case?

I'm able to kill the main process (allegro_single_object_simulation) by sending a boolean over LCM from the other process (run_twisting_mug), calling AdvanceTo in smaller time increments within the main process, and checking the received boolean after each of the smaller AdvanceTo calls. This seems to work reliably, but it may not be the cleanest solution.
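
For reference, the chunked approach described above looks roughly like the sketch below. FakeSimulator, FakeContext, and AdvanceUntilStopped are hypothetical names introduced only for illustration. The stand-in mimics just the two calls the loop needs (AdvanceTo and get_context().get_time()), so the same template should in principle accept a real simulator. The stop_requested callback stands for whatever reads the boolean received over LCM.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Hypothetical stand-ins so the sketch compiles without Drake. They mimic
// only the calls used below: AdvanceTo() and get_context().get_time().
class FakeContext {
 public:
  double get_time() const { return time_; }
  void set_time(double t) { time_ = t; }

 private:
  double time_{0.0};
};

class FakeSimulator {
 public:
  void AdvanceTo(double t) { context_.set_time(t); }
  const FakeContext& get_context() const { return context_; }

 private:
  FakeContext context_;
};

// Advances `sim` toward `end_time` in steps of at most `chunk` seconds and
// checks `stop_requested` before every step. Returns true if `end_time` was
// reached, false if a stop was requested first.
template <typename Sim>
bool AdvanceUntilStopped(Sim& sim, double end_time, double chunk,
                         const std::function<bool()>& stop_requested) {
  assert(chunk > 0.0);  // A non-positive chunk would never make progress.
  while (sim.get_context().get_time() < end_time) {
    if (stop_requested()) {
      return false;
    }
    const double next = sim.get_context().get_time() + chunk;
    sim.AdvanceTo(std::min(next, end_time));
  }
  return true;
}
```

In the real example, stop_requested would return the latest value of the flag set by the LCM subscription, and the caller would exit cleanly when AdvanceUntilStopped returns false.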

If I'm thinking about this the wrong way and there's a better way to run an example like this, please let me know. Thanks!

Solution

We often use a process manager, like https://github.com/RobotLocomotion/libbot/tree/master/bot2-procman, to start and manage all of our processes. The ROS ecosystem has similar tools. procman is open and available for you to use, but we don't consider it officially "supported" by the Drake developers.
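
To illustrate the general idea only (this is not how procman is implemented), here is a minimal POSIX sketch of a supervisor that launches two executables and, as soon as either exits, terminates the other. Spawn and SuperviseUntilFirstExit are hypothetical helper names, and the sketch assumes a Unix-like system where SIGTERM is enough to stop the remaining process.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>
#include <vector>

// Forks and execs the command in `argv`, which must end with nullptr.
// Returns the child's pid, or -1 if fork failed.
pid_t Spawn(const std::vector<const char*>& argv) {
  const pid_t pid = fork();
  if (pid == 0) {
    execvp(argv[0], const_cast<char* const*>(argv.data()));
    _exit(127);  // Only reached if exec itself failed.
  }
  return pid;
}

// Blocks until child `a` or child `b` exits, then sends SIGTERM to the other
// one and reaps it. Returns the pid that exited first, or -1 on error.
pid_t SuperviseUntilFirstExit(pid_t a, pid_t b) {
  // Refuse invalid pids: kill(-1, ...) would signal every process we own.
  if (a <= 0 || b <= 0) {
    return -1;
  }
  int status = 0;
  pid_t first = -1;
  while (first != a && first != b) {
    first = waitpid(-1, &status, 0);  // Wait for any child to exit.
    if (first == -1 && errno != EINTR) {
      return -1;
    }
  }
  const pid_t other = (first == a) ? b : a;
  kill(other, SIGTERM);
  // Reap the terminated process so it does not linger as a zombie.
  while (waitpid(other, &status, 0) == -1 && errno == EINTR) {
  }
  return first;
}
```

A small launcher could call Spawn once for the simulation binary and once for the controller binary, then call SuperviseUntilFirstExit on the two pids. Neither program then needs its own LCM-based shutdown logic, which is the main appeal of delegating this to a process manager.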