
Reading JSON data line by line in CherryPy


I have JSON data that will arrive at my server in the following format:

{"line":"one"}
{"line":"two"}
{"line":"three"}

While I realize that this is not valid JSON, I have no control over how the data reaches me. I need to be able to read the data line by line.

I have a very simple CherryPy server set up to accept the POST request. Here is the class that handles it:

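import cherrypy

class PostEvent(object):
    exposed = True  # expose this resource to CherryPy

    def POST(self, **urlParams):
        # Read exactly Content-Length bytes of the raw request body.
        cl = cherrypy.request.headers['Content-Length']
        raw_body = cherrypy.request.body.read(int(cl))
        # Split on line boundaries and log each line separately.
        lines = raw_body.splitlines()
        with open('log.txt', 'wb') as f:  # binary mode: raw_body is bytes
            for line in lines:
                f.write(line + b'\n')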

Then I simply issue the following curl command to test:

curl -i -k -H "Content-Type: application/json" -H "Accept: application/json" -X POST --data @test_data -u username http://test-url.com

The file test_data contains my JSON data in the format shown above. I get a 200 response; however, all of the data read from the file ends up on one line, like this:

{"line":"one"}{"line":"two"}{"line":"three"}

It seems that CherryPy ignores line delimiters such as \n when it reads the body. How do I get CherryPy to read the request body as it is formatted? More specifically, how can I read the request body line by line rather than all at once?


Solution

  • I cannot imagine CherryPy mangling data like that.

    Your test writing out the newline count shows that it is much more probable that curl is not sending the data with newlines intact. By the time your request handler receives the body, all newlines have already been stripped, so raw_body.splitlines() just returns [raw_body] and a single line is written.

    Make sure you POST with the --data-binary switch, i.e. --data-binary @test_data. When -d reads its data from a file, it strips carriage returns and newlines. From the curl manual:

    -d, --data is the same as --data-ascii. To post data purely binary, you should instead use the --data-binary option.
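Once the newlines arrive intact, the body can be split and each line parsed on its own. The following is a minimal sketch rather than a definitive implementation: the helper name parse_json_lines is hypothetical, the body is assumed to be UTF-8 encoded bytes, and the two sample bodies only simulate what the handler would see with --data-binary versus -d.

```python
import json


def parse_json_lines(raw_body):
    """Parse a newline-delimited JSON body (bytes) into a list of objects.

    Hypothetical helper for illustration; assumes UTF-8 encoded input.
    """
    text = raw_body.decode("utf-8")
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        records.append(json.loads(line))
    return records


# Simulated body as sent with --data-binary: newlines are preserved.
intact = b'{"line":"one"}\n{"line":"two"}\n{"line":"three"}\n'
print(len(intact.splitlines()))    # 3
print(parse_json_lines(intact))

# Simulated body as sent with -d @test_data: newlines already stripped.
stripped = intact.replace(b"\n", b"")
print(len(stripped.splitlines()))  # 1
```

In the handler from the question, the same helper could presumably be called as parse_json_lines(raw_body) right after the body is read. Note that it cannot recover the stripped variant: without delimiters, json.loads sees several objects concatenated on one line and raises an error, so the fix still has to happen on the curl side.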