How to construct a webob.Request or a WSGI 'environ' dict from raw HTTP request byte stream?


Suppose I have a byte stream with the following in it:

POST /mum/ble?q=huh
Content-Length: 18
Content-Type: application/json; charset="utf-8"
Host: localhost:80

["do", "re", "mi"]

Is there a way to produce a WSGI-style 'environ' dict from it?

Hopefully, I've overlooked an easy answer, and it is as easy to achieve as the opposite operation. Consider:

>>> import json
>>> from webob import Request
>>> r = Request.blank('/mum/ble?q=huh')
>>> r.method = 'POST'
>>> r.content_type = 'application/json'
>>> r.charset = 'utf-8'
>>> r.body = json.dumps(['do', 're', 'mi'])
>>> print str(r) # Request's __str__ method gives raw HTTP bytes back!
POST /mum/ble?q=huh
Content-Length: 18
Content-Type: application/json; charset="utf-8"
Host: localhost:80

["do", "re", "mi"]

Solution

  • Reusing Python's standard library code for this purpose is a bit tricky (it was not designed to be reused that way), but it should be doable, e.g.:

    import cStringIO
    from wsgiref import simple_server, util
    
    input_string = """POST /mum/ble?q=huh HTTP/1.0
    Content-Length: 18
    Content-Type: application/json; charset="utf-8"
    Host: localhost:80
    
    ["do", "re", "mi"]
    """
    
    class FakeHandler(simple_server.WSGIRequestHandler):
        def __init__(self, rfile):
            self.rfile = rfile
            self.wfile = cStringIO.StringIO() # for error msgs
            self.server = self
            self.base_environ = {}
            self.client_address = ['?', 80]
            self.raw_requestline = self.rfile.readline()
            self.parse_request()
    
        def getenv(self):
            env = self.get_environ()
            util.setup_testing_defaults(env)
            env['wsgi.input'] = self.rfile
            return env
    
    handler = FakeHandler(rfile=cStringIO.StringIO(input_string))
    wsgi_env = handler.getenv()
    
    print wsgi_env
    

    Basically, we need to subclass the request handler to fake out the construction process that's normally performed for it by the server (rfile and wfile built from the socket to the client, and so on). This isn't quite complete, I think, but should be close and I hope it proves helpful!
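
The code above targets Python 2 (cStringIO, the print statement). As a rough sketch only, the same trick might look like this on Python 3, assuming io.BytesIO stands in for cStringIO and that the request arrives as bytes with CRLF line endings. The names raw_request, FakeHandler and wsgi_env are illustrative, and like the original this is probably not a complete environ:

```python
import io
from wsgiref import simple_server, util

# Raw request as bytes; note the HTTP/1.0 at the end of the request line.
raw_request = (
    b"POST /mum/ble?q=huh HTTP/1.0\r\n"
    b"Content-Length: 18\r\n"
    b'Content-Type: application/json; charset="utf-8"\r\n'
    b"Host: localhost:80\r\n"
    b"\r\n"
    b'["do", "re", "mi"]'
)


class FakeHandler(simple_server.WSGIRequestHandler):
    """Parses a request from a file-like object instead of a socket."""

    def __init__(self, rfile):
        # Deliberately skip the base __init__, which expects a live socket.
        self.rfile = rfile
        self.wfile = io.BytesIO()  # collects any error response
        self.server = self  # get_environ() reads self.server.base_environ
        self.base_environ = {}
        self.client_address = ("127.0.0.1", 80)  # placeholder address
        self.raw_requestline = self.rfile.readline()
        self.parse_request()

    def getenv(self):
        env = self.get_environ()
        util.setup_testing_defaults(env)  # fills in missing wsgi.* keys
        env["wsgi.input"] = self.rfile  # now positioned at the body
        return env


handler = FakeHandler(io.BytesIO(raw_request))
wsgi_env = handler.getenv()

for key in ("REQUEST_METHOD", "PATH_INFO", "QUERY_STRING", "CONTENT_TYPE"):
    print(key, "=", wsgi_env[key])
```

The request body is left unread, so a WSGI application (or webob.Request(wsgi_env)) can consume it from wsgi_env['wsgi.input'].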

    Note that I've also fixed your example HTTP request: without an HTTP/1.0 or 1.1 at the end of the raw request line, a POST is considered ill-formed and causes an exception and a resulting error message on handler.wfile.
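
To see the effect of the missing version for yourself, here is a small sketch, assuming Python 3's http.server.BaseHTTPRequestHandler (the base class of the wsgiref handler). ProbeHandler and its parsed_ok attribute are hypothetical names made up for this illustration. It records what parse_request() returns and what ends up on wfile:

```python
import io
from http.server import BaseHTTPRequestHandler


class ProbeHandler(BaseHTTPRequestHandler):
    """Hypothetical helper: parses one request without a socket."""

    def __init__(self, rfile):
        self.rfile = rfile
        self.wfile = io.BytesIO()  # error responses are written here
        self.client_address = ("127.0.0.1", 80)  # placeholder address
        self.raw_requestline = self.rfile.readline()
        # parse_request() returns False when the request is rejected.
        self.parsed_ok = self.parse_request()

    def log_message(self, format, *args):
        pass  # keep stderr quiet for this demonstration


# Request line without a version, as in the question.
bad = ProbeHandler(
    io.BytesIO(b"POST /mum/ble?q=huh\r\nHost: localhost:80\r\n\r\n")
)
# Same request with HTTP/1.0 appended, as in the answer.
good = ProbeHandler(
    io.BytesIO(b"POST /mum/ble?q=huh HTTP/1.0\r\nHost: localhost:80\r\n\r\n")
)

error_output = bad.wfile.getvalue()
print("without version:", bad.parsed_ok, len(error_output), "bytes on wfile")
print("with version:   ", good.parsed_ok, len(good.wfile.getvalue()), "bytes on wfile")
```

In the rejected case the handler never sets attributes such as path or headers, which would explain why a later get_environ() call fails with an exception.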