I'm having trouble parsing the body of a request using jsonlines. I'm using tornado as the server and this is happening inside a post() method. My purpose in this is to parse the request's body into separate JSONs, then iterate over them with a jsonlines Reader, do some work on each one and then push them to a DB. I solved this problem by dumping the utf-8 encoded body into a file and then used:
with jsonlines.open("temp.txt") as reader:
That works for me. I can iterate over the entire file with
for obj in reader:
I just feel like this is an unnecessary overhead that can be reduced if I can understand what's keeping me from just using this bit of code instead:
log = self.request.body.decode("utf-8")
with jsonlines.Reader(log) as reader:
for obj in reader:
the exception I get is this:
jsonlines.jsonlines.InvalidLineError: line contains invalid json: Expecting property name enclosed in double quotes: line 1 column 2 (char 1) (line 1)
I've tried searching for this error here and all I found were examples where people tried using incorrectly formatted jsons that have one quote instead of double quotes. That is not the case for me. I debugged the request and saw that the string that returns from the decode method indeed has double quotes for both properties and values.
here is a sample of the body of the request I send (this is what it looks like in Postman):
{"type":"event","timestamp":"2018-03-25 09:19:50.999","event":"ButtonClicked","params":{"screen":"MainScreen","button":"SettingsButton"}}
{"type":"event","timestamp":"2018-03-25 09:19:51.061","event":"ScreenShown","params":{"name":"SettingsScreen"}}
{"type":"event","timestamp":"2018-03-25 09:19:53.580","event":"ButtonClicked","params":{"screen":"SettingsScreen","button":"MissionsButton"}}
{"type":"event","timestamp":"2018-03-25 09:19:53.615","event":"ScreenShown","params":{"name":"MissionsScreen"}}
You can reproduce the exception by using this simple bit of code in a post method and sending the lines I provided through Postman:
log = self.request.body.decode("utf-8")
with jsonlines.Reader(log) as currentlog:
for obj in currentlog:
print("obj")
As a sidenote: Postman sends the data as text, not JSON.
If you need any more information to answer this question, please let me know. One thing I did notice is that the string that returns from the decode method starts and ends with one quote. I guess this is because of the double quotes in the JSONs themselves. Is it related in any way? An example:
'{"type":"event","timestamp":"2018-03-25 09:19:50.999","event":"ButtonClicked","params":{"screen":"MainScreen","button":"SettingsButton"}}'
Thanks for any help!
jsonlines.Reader accepts iterable as an arg ("The first argument must be an iterable that yields JSON encoded strings" not json-encoded single string as in your example), but, after .decode("utf-8")
, log would be a string, which happen to support iterable interface. So when reader calls under the hood next(log)
it will get first item of a log string, i.e. character {
and will try to process it as an json-line which would be obviously invalid. Try log = log.split()
before passing log to the Reader.