I Have run into a few examples of managing threads with the threading module (using Python 2.6).
What I am trying to understand is how is this example calling the "run" method and where. I do not see it anywhere. The ThreadUrl class gets instantiated in the main() function as "t" and this is where I would normally expect the code to start the "run" method.
Maybe this is not the preferred way of working with threads? Please enlighten me:
#!/usr/bin/env python
import Queue
import time
import urllib2
import threading
import datetime
hosts = ["http://example.com/", "http://www.google.com"]
queue = Queue.Queue()
class ThreadUrl(threading.Thread):
"""Threaded Url Grab"""
def __init__(self, queue):
threading.Thread.__init__(self)
self.queue = queue
def run(self):
while True:
#grabs host from queue
host = self.queue.get()
#grabs urls of hosts and prints first 1024 bytes of page
url = urllib2.urlopen(host)
print url.read(10)
#signals to queue job is done
self.queue.task_done()
start = time.time()
def main():
#spawn a pool of threads, and pass them queue instance
for i in range(1):
t = ThreadUrl(queue)
t.setDaemon(True)
t.start()
for host in hosts:
queue.put(host)
queue.join()
main()
print "Elapsed time: %s" % (time.time() - start)
Per the pydoc:
Thread.start()
Start the thread’s activity.
It must be called at most once per thread object. It arranges for the object’s run() method to be invoked in a separate thread of control.
This method will raise a RuntimeException if called more than once on the same thread object.
The way to think of python Thread
objects is that they take some chunk of python code that is written synchronously (either in the run
method or via the target
argument) and wrap it up in C code that knows how to make it run asynchronously. The beauty of this is that you get to treat start
like an opaque method: you don't have any business overriding it unless you're rewriting the class in C, but you get to treat run
very concretely. This can be useful if, for example, you want to test your thread's logic synchronously. All you need is to call t.run()
and it will execute just as any other method would.