python multithreading queue producer-consumer

why doesn't a simple python producer/consumer multi-threading program speed up by adding the number of workers?

The code below is almost identical to the python official Queue example at http://docs.python.org/2/library/queue.html

from Queue import Queue
from threading import Thread
from time import time
import sys

num_worker_threads = int(sys.argv[1])
source = xrange(10000)

def do_work(item):
    for i in xrange(100000):
        pass

def worker():
    while True:
        item = q.get()
        do_work(item)
        q.task_done()

q = Queue()

for item in source:
    q.put(item)

start = time()

for i in range(num_worker_threads):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

q.join()

end = time()

print(end - start)

These are the results on a Xeon 12-core processor:

$ ./speed.py 1
12.0873839855

$ ./speed.py 2
15.9101941586

$ ./speed.py 4
27.5713479519

I expected that increasing the number of workers reduce the response time but instead, it is increasing. I did the experiment again and again but the result didn't change.

Am I missing something obvious? or the python queue/threading doesn't work well?

Solution

Python is rather poor at multi-threading. Due to a global lock only one thread normally makes progress at a time. See http://wiki.python.org/moin/GlobalInterpreterLock