I'm getting results I can't explain when making a ZeroMQ REQ/REP
request with polling.
The elapsed time, taken by the poll()
-method, is influenced by a time.sleep()
elsewhere in the code.
I have a server process which runs a command, and returns the pid
of the process. This is the request side code which has the issue:
import datetime, sys, time, zmq sleeptime = float(sys.argv[1]) def log_msg(msg): print "{0}, {1}".format(datetime.datetime.utcnow().strftime('%Y%m%d_%H:%M:%S.%f'), msg) context = zmq.Context() socket = context.socket(zmq.REQ) socket.connect("tcp://{0}:{1}".format('myserver', 9980)) poller = zmq.Poller() poller.register(socket, zmq.POLLIN) req = { 'req_type':'raw_cmd', 'cmd':'echo hello', 'block':0, 'timeout':300, 'return_output': 0} for i in range(4): time.sleep(sleeptime) socket.send_json(req) start = time.time() socks = dict((poller.poll(30000))) elapsed = (time.time()-start)*1000 rep = socket.recv_json() log_msg('pid={0}, sleep={1}, time taken={2}'.format(rep['pid'], sleeptime, int(elapsed)))
The first poll elapsed time is quite variable, but all the subsequent ones take about 2 seconds less the sleep time, unless sleep time is 0, in which case it's fast, so:
sleep = 0.5
:
20201008_08:27:24.168800, pid=52528, sleep=0.5, time taken=505 20201008_08:27:26.210196, pid=52529, sleep=0.5, time taken=1540 20201008_08:27:28.250891, pid=52530, sleep=0.5, time taken=1539 20201008_08:27:30.295036, pid=52531, sleep=0.5, time taken=1543
sleep = 1.5
:
20201008_08:44:02.474492, pid=54730, sleep=1.5, time taken=295 20201008_08:44:04.516844, pid=54731, sleep=1.5, time taken=540 20201008_08:44:06.557980, pid=54732, sleep=1.5, time taken=539 20201008_08:44:08.599717, pid=54733, sleep=1.5, time taken=539
sleep = 0
:
20201008_08:27:13.999147, pid=52513, sleep=0.0, time taken=690 20201008_08:27:14.033915, pid=52514, sleep=0.0, time taken=34 20201008_08:27:14.068803, pid=52515, sleep=0.0, time taken=34 20201008_08:27:14.103947, pid=52516, sleep=0.0, time taken=34
So question is, why does the time.sleep()
affect the time taken by the ZeroMQ poller.poll()
?
The code as-is mixes apples with oranges:
time.time()
returns seconds since epoch as float
, i.e. ( time.time() - start )*1000
yields how many [ms]
milliseconds the code-section under test actually took.
Whereas the poller.poll()
-method expects milliseconds for it's timeout
, so the requested 30000 [ms]
makes the poller
-instance wait and hold until 30 [s]
elapse or a message-arrival related zmq.POLLIN
-event permits to step forwards anywhere earlier.
This said, your measurement explores not the .sleep()
-method, but intervals taken at random by an external process, that sends REP
-side answers to REQ
-side .send()
-s.
Still in doubts? Set sleeptime to 31 [s]
and you ought receive almost zero durations reported from an ill-formulated local time-bracket (sure, if and only if the remote REP
-lying entity did not crash of suffocate during such ops).
You might also like zmq.Stopwatch
class, having { .start(), .stop() }
-methods, that provide a smart tool for using a ZeroMQ native [us]
-resolution-clock. Most of my posts on ZeroMQ performance use this for microbenchmarking.