Search code examples
pythonasynchronousgeventstrace

What could I grep for in strace output from a gevent app to test whether it's using blocking network IO?


I have a small gevent server that also is an http client of other apis. One library I use has httplib as a dependency.

I do monkey.patch_all. There's no disk IO, just network.

I'm rusty on my system calls, but I was wondering if there'd be a simple way to test for whether blocking calls are being made using strace.

I know how to use strace but am not sure exactly what I might grep for.

Would this work?

Thanks


Solution

  • Went ahead an experimented a bit myself and think I found at least a partial answer.

    I made a throwaway webapp that simply sleeps 5 seconds then responds 'hello'.

    Then I wrote two clients. One uses python requests, the other uses grequests (an asynchronous requests module).

    First client:

    import requests
    
    resp = requests.get('http://localhost:1234')
    

    Second client:

    from gevent import monkey; monkey.patch_all()
    import grequests
    
    req = grequests.get('http://localhost:1234')
    resp = grequests.map([req])[0]
    

    I ran these as:

    strace -T -f -s1000  python blocking_io.py &> out
    

    In the output of the (expected) blocking version, I get:

    sendto(3, "GET / HTTP/1.1\r\nHost: localhost:1234\r\nConnection: keep-alive\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nUser-Agent: python-requests/2.9.1\r\n\r\n", 144, 0, NULL, 0) = 144 <0.000022>
    fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR) <0.000008>
    fcntl(3, F_SETFL, O_RDWR)               = 0 <0.000007>
    recvfrom(3, "HTTP/1.0 200 OK\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 12\r\n", 8192, 0, NULL, NULL) = 77 <5.009440>
    recvfrom(3, "Server: Werkzeug/0.11.5 Python/2.7.10\r\nDate: Tue, 05 Apr 2016 22:28:38 GMT\r\n\r\nHello World!", 8192, 0, NULL, NULL) = 90 <0.000067>
    close(3)                                = 0 <0.000175>
    

    Note the 5 second wait time at the end of the line for the first recvfrom.

    Then I did it for the async version:

    [pid 24359] sendto(5, "GET / HTTP/1.1\r\nHost: localhost:1234\r\nConnection: keep-alive\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nUser-Agent: python-requests/2.9.1\r\n\r\n", 144, 0, NULL, 0) = 144 <0.000023>
    [pid 24359] fcntl(5, F_GETFL)           = 0x802 (flags O_RDWR|O_NONBLOCK) <0.000008>
    [pid 24359] fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008>
    [pid 24359] recvfrom(5, 0x2481624, 8192, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000009>
    [pid 24359] epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=8589934597}}) = 0 <0.000008>
    [pid 24359] epoll_wait(3, {{EPOLLIN, {u32=5, u64=8589934597}}}, 64, 59743) = 1 <5.009130>
    [pid 24359] recvfrom(5, "HTTP/1.0 200 OK\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 12\r\nServer: Werkzeug/0.11.5 Python/2.7.10\r\nDate: Tue, 05 Apr 2016 22:29:50 GMT\r\n\r\nHello World!", 8192, 0, NULL, NULL) = 167 <0.000114>
    [pid 24359] close(5)                    = 0 <0.000185>
    

    In this case the 5 seconds is spend doing epoll_wait.

    Additionally, where the blocking version does

    fcntl(3, F_SETFL, O_RDWR)
    

    The asynch version does

    fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK)