
Disadvantages of multithreaded downloading


I have a question about multithreaded downloading. As you know, downloading with several threads can improve an application's performance, but there are some constraints to respect, such as the number of threads and the available bandwidth. What I don't really understand is why performance might degrade when many threads are used, and how the bandwidth or the quality of the server can affect a multithreaded application. In which cases is a single-threaded download faster than a multithreaded one?
Thanks for your replies.


Solution

  • I assume you're referring to download managers.

    First, I'm sceptical of how much "performance" benefit a download manager really provides. But more importantly, any benefit they do provide is not due to multi-threading. The performance constraint of a download is the bandwidth of the connection. And this is why I'm sceptical of the benefits:

    • A 1 Mbps connection will download at 1 Mbps.
    • Splitting the file into 4 segments means you download each segment at roughly 250 Kbps, and 4 * 250 Kbps = 1 Mbps.
    • You may get some improvement if a server throttles each download segment.
    • You may get a small benefit if one of the segments gets timed out: the others downloading mean your connection doesn't sit idle during the time-out wait.
    • You might also speed up a download by 'drowning out' anything else trying to use the connection. (Not that I'd really call this a benefit though.)
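
    The arithmetic in the list above can be sketched in a few lines of Python. This is a simplified model, not a measurement: the function `download_seconds`, its parameters, and the assumption that all connections share the link evenly are illustrative and not taken from any real download manager.

```python
def download_seconds(file_megabits, link_mbps, segments, per_conn_cap_mbps=None):
    """Estimate download time when `segments` connections share one link.

    Assumption (hypothetical model): the link is split evenly between
    connections, and the server may cap each connection separately.
    """
    # Each connection gets an equal share of the link bandwidth.
    per_conn = link_mbps / segments
    # A server-side throttle limits every connection individually.
    if per_conn_cap_mbps is not None:
        per_conn = min(per_conn, per_conn_cap_mbps)
    # The total rate can never exceed the link itself.
    total_rate = per_conn * segments
    return file_megabits / total_rate


# No throttle: 1 segment and 4 segments take the same time.
unthrottled_1 = download_seconds(80, link_mbps=1, segments=1)
unthrottled_4 = download_seconds(80, link_mbps=1, segments=4)

# Server caps each connection at 0.25 Mbps: 4 segments now help.
throttled_1 = download_seconds(80, link_mbps=1, segments=1, per_conn_cap_mbps=0.25)
throttled_4 = download_seconds(80, link_mbps=1, segments=4, per_conn_cap_mbps=0.25)
```

    Under this model, splitting only pays off in the throttled case, because that is the only case where a single connection cannot fill the link on its own.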

    The real benefit of a download manager is in automatically restarting downloads efficiently (i.e. not re-starting from scratch if possible).

    So what is the point of multi-threading?

    Let's first dispel a myth: multi-threading does not make the work itself any cheaper. If a routine requires X clock-cycles to run, it will take X clock-cycles, whether on 1 thread or many threads.

    What multi-threading does do: it allows tasks to run concurrently (at the same time).

    The ability to do different things at the same time means:

    • A slow task (combining various segments of a large download) can be done on a different thread without interfering with other threads that need to react quickly (such as the user interface).
    • Concurrent tasks can also make better use of the available resources (such as multiple CPUs). Note (in answer to the last part of your question) that if you only have one CPU, your threads are "time-sliced" by the operating system, so they are not truly concurrent. But the time slices are very small, so the previous benefit still applies.

    When is single-threaded faster than multi-threaded?

    Well, pretty much always in cases where the CPU is not the bottle-neck. In the case of a download, as mentioned before, the bottle-neck is the bandwidth between the two end-points of the connection. Using many threads actually means you have to do more work (managing and coordinating the different threads).

    The most efficient approach for a download is 2 threads: one for the UI, and the other for the download, so that any pauses/delays don't stall the user interface.
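
    A minimal sketch of that two-thread layout, using Python's standard `threading` and `queue` modules. The names `download_worker` and `run_ui` are hypothetical, and the network read is simulated with `time.sleep` so the example runs without a server; a real UI toolkit would have its own event loop in place of the `while` loop.

```python
import queue
import threading
import time


def download_worker(chunks, progress):
    """Hypothetical download loop; each chunk stands in for a blocking network read."""
    received = 0
    for chunk in chunks:
        time.sleep(0.01)        # simulated network delay (assumption)
        received += len(chunk)
        progress.put(received)  # report bytes received so far to the UI thread
    progress.put(None)          # sentinel: the download has finished


def run_ui(chunks):
    """Stand-in for a UI loop: it polls for progress and never blocks for long."""
    progress = queue.Queue()
    worker = threading.Thread(target=download_worker, args=(chunks, progress), daemon=True)
    worker.start()
    updates = []
    while True:
        try:
            item = progress.get(timeout=0.05)
        except queue.Empty:
            continue            # a real UI would handle other events here
        if item is None:
            break
        updates.append(item)    # e.g. redraw a progress bar
    worker.join()
    return updates


updates = run_ui([b"ab", b"cde", b"f"])
```

    The queue is the only thing the two threads share, which keeps the coordination cost small: the worker never touches UI state, and the UI never waits on the network.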

    More generally, even when you have CPU-intensive work that could theoretically benefit from multiple threads doing different work concurrently, it's very easy to make implementation mistakes that actually slow down your application.

    • Ideally your multiple tasks should not share data, because if they do, you risk race conditions and other concurrency bugs.
    • When they do have to share data, you need to synchronise the work in some way to avoid the above-mentioned bugs. (There are many techniques to choose from depending on your needs and I won't go into detail here.)
    • However, if your synchronisation is poorly planned, you risk introducing a number of problems that can significantly slow down your application. These include:
      • Bottle-necking through a shared resource, so that your multiple threads are unable to run concurrently in any case.
      • High lock contention, where tasks spend more time waiting than working.
      • Deadlocks, which can totally block some tasks.
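
    To make the first two points concrete, here is a small sketch with Python's `threading` module. Both function names are hypothetical. The first version pushes every increment through one shared lock, so the threads queue up behind each other; the second gives each thread its own private total and combines the results only once at the end. Both return the same answer; the point is how much coordination each one needs, and no timing is claimed here.

```python
import threading


def count_with_shared_lock(n_threads, n_increments):
    """Every increment takes the same lock: correct, but the threads serialise."""
    total = 0
    lock = threading.Lock()

    def work():
        nonlocal total
        for _ in range(n_increments):
            with lock:          # all threads contend for this one lock
                total += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_with_local_totals(n_threads, n_increments):
    """Each thread works on private data; results are merged once at the end."""
    results = [0] * n_threads

    def work(index):
        local = 0
        for _ in range(n_increments):
            local += 1          # no shared state touched inside the loop
        results[index] = local  # each thread writes only its own slot

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


locked_total = count_with_shared_lock(4, 10_000)
local_total = count_with_local_totals(4, 10_000)
```

    The second version follows the "don't share data" advice: the only synchronisation left is the `join` at the end, so there is nothing for the threads to contend over while they work.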