I've inherited the maintenance of some scientific computing code that uses Parallel Python on a cluster. With Parallel Python, jobs are submitted to a ppserver, which (in this case) talks to already-running ppserver processes on other computers, dishing tasks out to ppworker processes.
I'd like to use the standard library logging module to log errors and debugging information from the functions that get submitted to a ppserver. Since these ppworkers run as separate processes (on separate computers) I'm not sure how to properly structure the logging. Must I log to a separate file for each process? Maybe there's a log handler that would make it all better?
Also, I want reports on what process on what computer has hit an error, but the code I'm writing the logging in probably isn't aware of these things; maybe that should be happening at the ppserver level?
(A version of this question is cross-posted on the Parallel Python forums; I'll post an answer here if I get anything there from a non-SO user.)
One way to solve your problem is to use a
logging.handlers.SocketHandler
to send events from each worker to a dedicated logging process. If you catch exceptions in your worker functions and log them, then you should be able to get visibility of errors across all workers in one place.
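A minimal, self-contained sketch of the idea (not the actual ppworker integration — the logger name, port, and `work()` function are placeholders). The listener unpickles the length-prefixed `LogRecord`s that `SocketHandler` emits; in a real deployment it would run as its own process on one machine, and each ppworker would point its `SocketHandler` at that host:

```python
import logging
import logging.handlers
import pickle
import socket
import socketserver
import struct
import threading

received = []  # records collected by the listener, for demonstration only


class LogRecordHandler(socketserver.StreamRequestHandler):
    """Unpickle the length-prefixed LogRecords sent by SocketHandler."""

    def handle(self):
        while True:
            # Each record is preceded by a 4-byte big-endian length.
            chunk = self.connection.recv(4)
            if len(chunk) < 4:
                break
            slen = struct.unpack(">L", chunk)[0]
            data = self.connection.recv(slen)
            while len(data) < slen:
                data += self.connection.recv(slen - len(data))
            record = logging.makeLogRecord(pickle.loads(data))
            received.append(record)


# Listener side: bind to an ephemeral port and serve in the background.
server = socketserver.ThreadingTCPServer(("localhost", 0), LogRecordHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Worker side: each ppworker configures a SocketHandler pointing at the listener.
logger = logging.getLogger("ppworker")
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.handlers.SocketHandler("localhost", port))


def work():
    """Stand-in for a function submitted to the ppserver."""
    try:
        1 / 0
    except ZeroDivisionError:
        # Including the hostname answers the "which computer?" question,
        # and logger.exception() captures the traceback for you.
        logger.exception("error on %s", socket.gethostname())


work()
```

Because the hostname and process id are attached at logging time in the worker, the code you're instrumenting doesn't need to know anything about the cluster layout; the central listener sees it all.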