Search code examples
multithreadingperformancedelphivcltiming

Can code running in a background thread be faster than in the main VCL thread in Delphi?


If anybody has had a lot of experience timing code running on the main VCL thread vs a background thread, I'd like to get an opinion. I have some code that does some heavy string processing running in my Delphi 6 application on the main thread. Each time I run an operation, the time for each operation hovers around 50 ms on a single thread on my i5 Quad core. What makes me really suspicious is that the same code running on an old Pentium 4 that I have, shows the same time for the operation when usually I see code running about 4 times slower on the Pentium 4 than the Quad Core. I am beginning to wonder if the code might be consuming significantly less time than 50 ms but that there's something about the main VCL thread, perhaps Windows message handling or executing Windows API calls, that is creating an artificial "floor" for the operation. Note, an operation is triggered by an incoming request on a socket if that matters, but the time measurement does not take place until the data is fully received.

Before I undertake the work of moving all the code on to a background thread for testing, I am wondering if anyone has any general knowledge in this area? What have your experiences been with code running on and off the main VCL thread? Note, the timing measurements are being done when there is absolutely no user triggered activity going on during the tests.

I'm also wondering if raising the priority of the thread to just below real-time would do any good. I've never seen much improvement in my run times when experimenting with those flags.

-- roschler


Solution

  • Without simple source code to reproduce the issue, and how you are timing your threads, it will be difficult to understand what occurs in your software.

    Sounds definitively like either:

    • An Architecture issue - how are your threads defined?
    • A measurement issue - how are you timing your threads?
    • A typical scaling issue of both the memory manager and the RTL string-related implementation.

    About the latest point, consider this:

    • The current memory manager (FastMM4) is not scaling well on multi-core CPU; try with a per-thread memory manager, like our experimental SynScaleMM - note e.g. that the Free Pascal Compiler team has written a new scaling MM from scratch recently, to avoid such issue;
    • Try changing the string process implementation to avoid memory allocation (use static buffers), and string reference-counting (every string reference counting access produces a LOCK DEC/INC which do not scale so well on multi-code CPU - use per-thread char-level process, using e.g. PChar on static buffers instead of string).

    I'm sure that without string operations, you'll find that all threads are equivalent.

    In short: neither the current Delphi MM, neither the current string implementation scales well on multi-core CPU. You just found out a known issue of the current RTL. Read this SO question.