I have a problem with limiting threads, and I want to solve it using QThread. SpiderThread is a QThread object that crawls a URL, and I want to limit the number of threads working at once to X. I did this earlier with a thread pool and QRunnable, but that is buggy in PySide when the number of URLs is large. So I have this simple code:
self.threads = []
for url in self.urls:
    th = SpiderThread(url)
    th.updateresultsSignal.connect(self.update_results)
    self.threads.append(th)
    th.start()
Does anyone have a working example of limiting threads using QThread?
So you want at most X threads running at any given time? How about a URL queue shared by 10 threads:
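from queue import Queue  # Python 2: from Queue import Queue

self.threads = []
queue = Queue()  # Queue is already thread-safe
for url in self.urls:
    queue.put(url)  # Queue() takes a max size, not a list, so fill it here
for i in range(10):  # 10 worker threads (xrange in Python 2)
    th = SpiderThread(queue)
    th.updateresultsSignal.connect(self.update_results)
    self.threads.append(th)
    th.start()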
Then, in the run method of each thread, the thread takes a URL off the queue (which removes it from the queue), and when it is done processing that URL, it takes a new one. In pseudocode:
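from queue import Empty  # Python 2: from Queue import Empty
from PySide.QtCore import QThread  # or PySide2 / PySide6, depending on your install

class SpiderThread(QThread):
    def __init__(self, queue):
        super(SpiderThread, self).__init__()  # QThread must be initialised
        self.queue = queue

    def run(self):
        max_wait = 0.1  # seconds (Queue.get takes its timeout in seconds)
        while True:
            try:
                url = self.queue.get(True, max_wait)
            except Empty:
                break  # no more URLs, work completed!
            process(url)  # placeholder for the actual crawling code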
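If you want to try the pattern without Qt installed, here is a minimal sketch of the same idea using only the standard library. Note the assumptions: it swaps QThread for threading.Thread, the function name crawl and the constant MAX_WORKERS are made up for illustration, and time.sleep stands in for the real download. It also records the peak number of workers busy at once, so you can check that the limit holds.

```python
import threading
import time
from queue import Queue, Empty

MAX_WORKERS = 3  # hypothetical limit (the "X" from the question)


def crawl(urls, max_workers=MAX_WORKERS):
    """Process every URL using at most max_workers threads at once."""
    queue = Queue()
    for url in urls:
        queue.put(url)  # fill the queue before any worker starts

    results = []
    lock = threading.Lock()
    active = 0  # workers currently processing a URL
    peak = 0    # highest value of active seen so far

    def worker():
        nonlocal active, peak
        while True:
            try:
                url = queue.get_nowait()
            except Empty:
                return  # queue drained, this worker is done
            with lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)  # stand-in for the real download (assumption)
            with lock:
                results.append(url)
                active -= 1

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, peak
```

Because only max_workers threads are ever created, the limit holds no matter how many URLs are queued. With QThread the structure should be the same, with the worker body living in run() and results reported through a signal instead of a shared list.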