Tags: python, multithreading, threadpool

Why isn't my multithreading helping speed things up?


I am reading lines of text into a list 'rows' and trying to use multithreading to speed up the processing. However, there is no speedup at all. Watching the CPU usage on my Mac, I noticed the CPU sits at around 145% with multithreading, yet the run is no faster.

import csv
import time
from concurrent.futures import ThreadPoolExecutor

te = TimeExtractor()  # defined elsewhere (not shown)
def time_test(text):
    result = te.compute_time(text)
    # print(result)

if __name__ == "__main__":

    start = time.time()
    rows = []
    with open('data/data.csv', 'r', encoding='utf') as f:
        csvreader = csv.DictReader(f, delimiter='\t', quoting=csv.QUOTE_ALL)
        for row in csvreader:
            rows.append(row['text'])

    with ThreadPoolExecutor(4) as executor:
        results = executor.map(time_test, rows)

    end = time.time()

    print(end-start)
    print('Done!!!')

Solution

  • A simplified version of your code, using Multi-threading, is:

    import time
    import concurrent.futures
    from multiprocessing import cpu_count
    
    num_cpu = cpu_count()
    print("CPU Count: ", num_cpu)  # cpu_count doesnt really matter
    e = concurrent.futures.ThreadPoolExecutor(num_cpu)
    
    
    def cpu_intensive_task(i):
        print(i, ' : start task')
        count = 10000000 * (i+1)
        while count > 0:
            count -= 1
        print(i, ' : end task')
        return i
    
    
    start = time.time()
    for i in e.map(cpu_intensive_task, range(10)):
        print(i, ' : in loop')
    end = time.time()
    
    print('LOOP DONE')
    print('Total Time taken: ', (end-start))
    

    Output:

    CPU Count:  8
    0  : start task
    1  : start task
    2  : start task
    3  : start task
    4  : start task
    5  : start task
    7  : start task
    6  : start task
    0  : end task
    8  : start task
    0  : in loop
    1  : end task
    9  : start task
    1  : in loop
    2  : end task
    2  : in loop
    3  : end task
    3  : in loop
    4  : end task
    4  : in loop
    5  : end task
    5  : in loop
    6  : end task
    6  : in loop
    7  : end task
    7  : in loop
    8  : end task
    8  : in loop
    9  : end task
    9  : in loop
    LOOP DONE
    Total Time taken:  30.59025502204895
    

    Note: executor.map yields results in submission order, so the loop exits only after all the threads are done

    Same code without Multi-threading:

    import time
    
    
    def cpu_intensive_task(i):
        print(i, ' : start task')
        count = 10000000 * (i+1)
        while count > 0:
            count -= 1
        print(i, ' : end task')
        return i
    
    
    start = time.time()
    for i in range(10):
        cpu_intensive_task(i)
        print(i, ' : in loop')
    end = time.time()
    
    print('LOOP DONE')
    print('Time taken: ', (end-start))
    

    Output:

    0  : start task
    0  : end task
    0  : in loop
    1  : start task
    1  : end task
    1  : in loop
    2  : start task
    2  : end task
    2  : in loop
    3  : start task
    3  : end task
    3  : in loop
    4  : start task
    4  : end task
    4  : in loop
    5  : start task
    5  : end task
    5  : in loop
    6  : start task
    6  : end task
    6  : in loop
    7  : start task
    7  : end task
    7  : in loop
    8  : start task
    8  : end task
    8  : in loop
    9  : start task
    9  : end task
    9  : in loop
    LOOP DONE
    Time taken:  30.072215795516968
    

    Note: the time taken is almost the same as with the multi-threading approach (in fact slightly less). Multi-threading does not help this type of CPU-bound workload

    Same code using multiprocessing:

    import time
    from multiprocessing import Process, cpu_count
    
    
    def cpu_intensive_task(i):
        print(i, ' : start task')
        count = 10000000 * (i+1)
        while count > 0:
            count -= 1
        print(i, ' : end task')
        return i
    
    
    if __name__ == '__main__':
        print("CPU Count: ", cpu_count())
        start = time.time()
        processes = []
        for i in range(10):
            p = Process(target=cpu_intensive_task, args=(i,))
            processes.append(p)
            p.start()
            print(i, ' : in loop')
    
        print('LOOP END')
        for p in processes:
            p.join()
        end = time.time()
        print('Total Time Taken: ', (end - start))
    

    Output:

    CPU Count:  8
    0  : in loop
    1  : in loop
    2  : in loop
    3  : in loop
    4  : in loop
    5  : in loop
    6  : in loop
    7  : in loop
    8  : in loop
    9  : in loop
    LOOP END
    0  : start task
    1  : start task
    2  : start task
    3  : start task
    5  : start task
    4  : start task
    8  : start task
    7  : start task
    6  : start task
    9  : start task
    0  : end task
    1  : end task
    2  : end task
    3  : end task
    4  : end task
    5  : end task
    6  : end task
    7  : end task
    8  : end task
    9  : end task
    Total Time Taken:  10.335741996765137
    

    Note: multiprocessing takes only about a third of the time taken by the multi-threading approach
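
    A similar speedup should also be achievable with less boilerplate by swapping ThreadPoolExecutor for concurrent.futures.ProcessPoolExecutor, which keeps the map-based structure of the first example while running the tasks in worker processes. This is a minimal sketch, not part of the original answer:

    import time
    import concurrent.futures


    def cpu_intensive_task(i):
        print(i, ' : start task')
        count = 10000000 * (i+1)
        while count > 0:
            count -= 1
        print(i, ' : end task')
        return i


    if __name__ == '__main__':
        start = time.time()
        # worker processes each have their own interpreter, so the GIL is not a bottleneck
        with concurrent.futures.ProcessPoolExecutor() as e:
            for i in e.map(cpu_intensive_task, range(10)):
                print(i, ' : in loop')
        end = time.time()
        print('Total Time taken: ', (end - start))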

    Global Interpreter Lock:

    The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. However, some extension modules, either standard or third-party, are designed so as to release the GIL when doing computationally-intensive tasks such as compression or hashing. Also, the GIL is always released when doing I/O.
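
    Because the GIL is released during I/O, a thread pool does help when the work is I/O-bound rather than CPU-bound. The following is a minimal sketch (not part of the original answer) that uses time.sleep to stand in for real I/O such as network or disk access:

    import time
    import concurrent.futures


    def io_bound_task(i):
        time.sleep(1)  # the GIL is released while waiting, just as it is during real I/O
        return i


    start = time.time()
    with concurrent.futures.ThreadPoolExecutor(4) as e:
        results = list(e.map(io_bound_task, range(8)))
    # 8 one-second waits spread over 4 threads finish in roughly 2 seconds, not 8
    print('Time taken: ', time.time() - start)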

    Multiprocessing:

    multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine.

    So, for CPU-bound Python code, only multiprocessing can utilise multiple processors and hence achieve true parallelism.
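
    Applied to the code in the question, the minimal change is therefore to replace ThreadPoolExecutor with ProcessPoolExecutor. The sketch below assumes that compute_time is CPU-bound pure-Python work and that TimeExtractor can be constructed in each worker process; it is an illustration, not a tested drop-in:

    import csv
    import time
    from concurrent.futures import ProcessPoolExecutor

    te = TimeExtractor()  # module-level, so each worker process ends up with its own copy

    def time_test(text):
        return te.compute_time(text)

    if __name__ == "__main__":
        start = time.time()
        with open('data/data.csv', 'r', encoding='utf') as f:
            csvreader = csv.DictReader(f, delimiter='\t', quoting=csv.QUOTE_ALL)
            rows = [row['text'] for row in csvreader]

        # for many small rows, a larger chunksize (e.g. chunksize=100) reduces inter-process overhead
        with ProcessPoolExecutor(4) as executor:
            results = list(executor.map(time_test, rows))

        print(time.time() - start)
        print('Done!!!')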