I have a long-running Python process that is generating more data than I planned for. My results are stored in a list that will be serialized (pickled) and written to disk when the program completes -- if it gets that far. At this rate, it's more likely that the list will exhaust the 1+ GB of free RAM and the process will crash, losing all my results in the process.
I plan to modify my script to write results to disk periodically, but I'd like to save the results of the currently-running process if possible. Is there some way I can grab an in-memory data structure from a running process and write it to disk?
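For reference, the periodic-write change I plan to make looks roughly like this (a minimal sketch: process(), work_items, the checkpoint path, and the every-1000 interval are all placeholders for my real code):

import pickle

CHECKPOINT_PATH = '/tmp/results.pkl'     # placeholder path

def process(item):
    # stand-in for the real computation
    return item * item

work_items = xrange(10000)               # stand-in for the real input

results = []
for i, item in enumerate(work_items):
    results.append(process(item))
    if (i + 1) % 1000 == 0:              # checkpoint every 1000 results
        f = open(CHECKPOINT_PATH, 'wb')
        pickle.dump(results, f, pickle.HIGHEST_PROTOCOL)
        f.close()

That covers future runs; my question is about the run that is already in progress.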
I found code.interact(), but since I don't already have this hook in my code, it doesn't seem useful to me (see Method to peek at a Python program running right now).
I'm running Python 2.5 on Fedora 8. Any thoughts?
Thanks a lot.
Shahin
There is not much you can do to a running program from the outside. The only thing I can think of is to attach the gdb debugger, stop the process, and examine the memory. Alternatively, make sure that your system is set up to save core dumps, then kill the process with kill -SEGV <pid>. You should then be able to open the core dump with gdb and examine it at your leisure.
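For the attach route, the session looks something like this (a sketch: 12345 is a placeholder pid, and the gcore command needs a reasonably recent gdb):

gdb -p 12345
(gdb) gcore /tmp/myscript.core
(gdb) detach
(gdb) quit

gcore writes a core file of the still-running process, so you don't have to kill it just to get a dump.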
There are some gdb macros that will let you examine Python data structures and execute Python code from within gdb, but for these to work you need to have compiled Python with debug symbols enabled, and I doubt that is the case here. Creating a core dump first and then recompiling Python with symbols will NOT work, since all the addresses will have changed from the values in the dump.
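Even without the macros, if gdb can resolve the interpreter's exported symbols, you may be able to inject a pickle call into the live process. This is a hedged sketch, not a guaranteed recipe: it assumes the list is reachable as a global named results in __main__, and that the thread gdb stopped in currently holds the GIL (otherwise the call can deadlock):

(gdb) call (int) PyRun_SimpleString("import pickle, __main__; pickle.dump(__main__.results, open('/tmp/results.pkl', 'wb'), 2)")

PyRun_SimpleString returns 0 on success and -1 if the injected snippet raised an exception.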
Here are some links for introspecting Python from gdb:
http://wiki.python.org/moin/DebuggingWithGdb
http://chrismiles.livejournal.com/20226.html
or google for 'python gdb'
N.B. To get Linux to create core dumps, use the ulimit command. Note that ulimit applies to the shell it is set in and to processes started from that shell afterwards, so the limit must already have been in effect when your long-running process was launched.
ulimit -a
will show you what the current limits are set to.
ulimit -c unlimited
will enable core dumps of any size.
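Once you have a core file, load it alongside the interpreter binary that produced it (a sketch; the paths and core file name will differ on your system):

gdb /usr/bin/python /tmp/myscript.core
(gdb) bt

bt gives you the C-level backtrace of the interpreter at the moment of the dump; the Python-specific macros linked above build on top of a session like this.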