I'm developing a fairly large shared library in C++ and recently replaced the complete set of database access classes, e.g. to use connection pooling.
Before the change, a certain DB-related task caused (according to perf etc.) the expected amount of CPU load from the lib on the application server, as well as CPU load on the DB server, and took about 45 minutes.
Now, after exchanging the database access classes, the task still takes nearly as long (~40 minutes), but CPU usage on both the application server and the DB server is VERY low.
So my conclusion is that I did indeed optimize the code (freeing up MUCH CPU on both systems), but the overall runtime did not decrease at all. Network usage on the app server also stayed the same, as expected.
There is probably some blocking code, like a semaphore (e.g. grabbing a connection from the pool) or a sleep, but I fail to find a way to profile RUNtime (wall-clock time) instead of CPUtime. That would give me a hint whether I have a disk issue (e.g. the DB writing to disk), a network issue (pointless reconnections to the DB), a (b)locking issue (e.g. multithreaded contention on pool.grab), something in the MQ area, or whatever.
Can someone give me a hint on how to profile this and find the time waster?
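To make clearer what I mean: as a crude fallback I could wrap suspected blocking spots by hand and compare wall-clock time against per-thread CPU time. Here's a minimal sketch of such a helper, assuming Linux/POSIX (clock_gettime with CLOCK_THREAD_CPUTIME_ID); ScopeTimer is a made-up name, not from any library. If wall time vastly exceeds CPU time for a scope, that scope is waiting, not computing:

```cpp
#include <chrono>
#include <iostream>
#include <string>
#include <time.h>   // clock_gettime, CLOCK_THREAD_CPUTIME_ID (POSIX)
#include <utility>

// Logs wall time vs. CPU time spent in a scope. A large gap between the
// two means the scope was blocked (lock, sleep, network, disk), not busy.
class ScopeTimer {
public:
    explicit ScopeTimer(std::string label)
        : label_(std::move(label)),
          wallStart_(std::chrono::steady_clock::now())
    {
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpuStart_);
    }

    ~ScopeTimer() {
        timespec cpuEnd;
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpuEnd);
        auto wallEnd = std::chrono::steady_clock::now();

        double wallMs = std::chrono::duration<double, std::milli>(
                            wallEnd - wallStart_).count();
        double cpuMs  = (cpuEnd.tv_sec  - cpuStart_.tv_sec)  * 1e3 +
                        (cpuEnd.tv_nsec - cpuStart_.tv_nsec) * 1e-6;

        std::cerr << label_ << ": wall " << wallMs << " ms, cpu " << cpuMs
                  << " ms, blocked ~" << (wallMs - cpuMs) << " ms\n";
    }

private:
    std::string label_;
    std::chrono::steady_clock::time_point wallStart_;
    timespec cpuStart_;
};

// Usage at a suspected blocking spot (pool.grab is the call from above;
// the surrounding code is hypothetical):
//
//     {
//         ScopeTimer t("pool.grab");
//         auto conn = pool.grab();
//     }
```

But that only works for spots I already suspect; I'd prefer something that finds the waiting for me.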
The DB system should be by far fat enough :-) but it is connected via an Internet tunnel; the rest is at the moment running in a single multithreaded process on the app server.
Update: Sometimes things are quite easy... One of the first hints in Mike's post was to simply pause the thing in a debugger and look at where the threads currently are. After searching through over 100 subthreads, I discovered a "hidden" sleep from development time that had accidentally been appended to another line, with several tabs in between pushing it far beyond the right edge of my viewing window. Thanks, Mike, for this link :-)
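For the record, the offending line looked roughly like this (reconstructed; the identifiers are made-up stand-ins, only the tab trick is real):

```cpp
#include <unistd.h>
#include <iostream>

// Stand-in for the real per-row work; names are hypothetical.
void handleRow(int row)
{
    std::cout << "processing row " << row << '\n';											sleep(5);
    // ^ the forgotten development-time sleep(5) sat after several tabs,
    //   far beyond the right edge of my editor window
}
```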