google-cloud-platform, google-cloud-dataproc

Possibility to catch a dataproc kill signal on a spark streaming job


I am looking for a way to catch a Dataproc job kill signal in a Python Spark streaming job: I have one specific job on Dataproc which opens several connections to a PostgreSQL DB, which itself has a limited number of connections in its pool. Currently, if the job is restarted, the connections do not get closed properly, and as a result the next instance of this job does not have enough connections available to operate correctly. If I were able to catch the kill signal in the job somehow, I could still ensure the connections are closed eventually.


Solution

  • I suspect the best you can do is to register an atexit handler in your Python driver (see the first sketch below); whether it actually gets called depends on the cause of the restart or failure, so you can only verify whether this works by testing it with your intended restart case first.

    Otherwise, if there is a way to force cleanup of orphaned connections through other means, it might be easier to look for them on startup and issue any necessary cleanup calls explicitly (see the second sketch below).
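
    A minimal sketch of the atexit approach in the driver. The `open_connections` registry and `close_connections` helper are hypothetical stand-ins for however your job tracks its PostgreSQL connections; the SIGTERM handler is an extra best-effort measure, since whether Dataproc actually delivers a catchable signal to the driver process depends on how the job is stopped:

    ```python
    import atexit
    import signal
    import sys

    # Hypothetical registry -- replace with however your job tracks its
    # PostgreSQL connections (e.g. a psycopg2 connection pool).
    open_connections = []

    def close_connections():
        """Close every tracked PostgreSQL connection exactly once."""
        while open_connections:
            conn = open_connections.pop()
            try:
                conn.close()
            except Exception:
                pass  # connection may already be gone; nothing left to do

    # atexit covers a normal interpreter shutdown of the driver.
    atexit.register(close_connections)

    def handle_term(signum, frame):
        # If the kill arrives as SIGTERM, clean up and exit so the job
        # doesn't leave connections behind.
        close_connections()
        sys.exit(0)

    # Best-effort: only helps if the driver process receives SIGTERM.
    signal.signal(signal.SIGTERM, handle_term)
    ```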
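
    And a sketch of the startup-cleanup alternative, assuming the job sets a known `application_name` when it connects and that the role used here is allowed to call `pg_terminate_backend` on those backends; the DSN and application name are placeholders:

    ```python
    import psycopg2

    # Hypothetical connection parameters -- adjust to your setup.
    ADMIN_DSN = "dbname=mydb user=admin host=10.0.0.5"
    APP_NAME = "dataproc-streaming-job"

    def terminate_orphaned_connections():
        """Kill leftover backends from a previous run before opening new ones."""
        conn = psycopg2.connect(ADMIN_DSN)
        try:
            with conn.cursor() as cur:
                # pg_stat_activity lists current backends; pg_terminate_backend
                # closes one by PID. Filtering on application_name keeps this
                # limited to connections opened by this job.
                cur.execute(
                    """
                    SELECT pg_terminate_backend(pid)
                    FROM pg_stat_activity
                    WHERE application_name = %s
                      AND pid <> pg_backend_pid()
                    """,
                    (APP_NAME,),
                )
            conn.commit()
        finally:
            conn.close()

    # Run once at job startup, before the streaming job opens its own pool.
    terminate_orphaned_connections()
    ```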