I am prototyping a simple Drake simulation. I have some simple Python LeafSystems that implement controllers. Without these systems, my simulation can run at realtime; with them, it runs much slower than realtime.
I don't think it's the math, but rather the overhead of Python vs. C++.
For this code:
https://github.com/EricCousineau-TRI/repro/tree/2e3865a7aefe8adc19a6ff69e84025def03da7fd/drake_stuff/python_profiling
If I profile with Python's cProfile and then use snakeviz to visualize the results, I can see that my Python code seems slow, but I can't tell how it compares to the C++ Drake code that pydrake binds.
Without Python LeafSystems (--no_control):
With the Python LeafSystems:
My tracepoint is in main(), but it does not appear in either of those.
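For context, the kind of cProfile setup I mean looks roughly like the sketch below. This is a simplified illustration, not the exact code from the linked repo: the body of main() is a stand-in workload, and the names profile_main and main.prof are placeholders I made up for this example.

```python
import cProfile


def main():
    # Stand-in workload (hypothetical); the real main() builds and runs
    # the Drake simulation.
    return sum(i * i for i in range(100_000))


def profile_main(stats_path="main.prof"):
    """Run main() under cProfile and write the stats to stats_path."""
    profiler = cProfile.Profile()
    # runcall() enables the profiler, calls main(), then disables it.
    result = profiler.runcall(main)
    # The dumped file can be opened afterwards with: snakeviz main.prof
    profiler.dump_stats(stats_path)
    return result


if __name__ == "__main__":
    profile_main()
```

cProfile only hooks Python-level function calls, so time spent inside the bound C++ code is attributed to whichever built-in call entered it rather than broken down further.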
How do I get better information about relative timing, without rolling my own timers?
I'm not sure if this is the best answer, but I found this post: https://stackoverflow.com/a/61253170/7829525
py-spy seems like an excellent tool for seeing relative performance information for Python code that involves CPython API extensions.
From my naive usage mentioned here:
https://github.com/benfred/py-spy/issues/531
https://github.com/EricCousineau-TRI/repro/tree/6048da3/drake_stuff/python_profiling
I can now see more information.
Looking at interactive SVG flamegraphs from py-spy with the default rate of 100 samples/sec:
Without Python LeafSystems (--no_control):
With Python LeafSystems:
Per @nicho's suggestion below, using py-spy's --native option can provide much better detail. Using Ctrl+F for ".py:", here's what it looks like:
Without Python LeafSystems (--no_control):
With Python LeafSystems:
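For anyone wanting to try something similar, here is a minimal sketch of how the py-spy invocations could be assembled. The script name sim.py and the helper py_spy_record_cmd are hypothetical placeholders (see the linked repo for the actual entry point); the py-spy options used (record, --output, --rate, --native) are the tool's documented ones, and --no_control is the script flag from the question.

```python
import shlex


def py_spy_record_cmd(script_args, output="profile.svg", rate=100, native=False):
    """Build a `py-spy record` command line (hypothetical helper)."""
    cmd = ["py-spy", "record", "--output", output, "--rate", str(rate)]
    if native:
        # Also sample native (C/C++) stack frames from extension modules.
        cmd.append("--native")
    # Everything after "--" is the program that py-spy launches and samples.
    cmd += ["--", "python", *script_args]
    return cmd


if __name__ == "__main__":
    # "sim.py" is a placeholder script name, not the real file in the repo.
    for args, out in [(["sim.py", "--no_control"], "no_control.svg"),
                      (["sim.py"], "with_control.svg")]:
        print(shlex.join(py_spy_record_cmd(args, output=out, native=True)))
```

The printed commands can be pasted into a shell, or passed to subprocess.run if py-spy is installed. Depending on the platform, py-spy may need elevated permissions to sample the process.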