
Parsing a file with two lists of JSON objects using json.load in Python


I have a large JSON file that contains two lists of JSON objects, one directly after the other.

Example data:

data.json

[{"a":1}][{"b":2}]

parser.py

import json

# json.load expects the file to hold exactly one JSON document
with open("data.json") as f:
    message = json.load(f)

for m in message:
    print(m)

As expected, I get a ValueError:

  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 10 - line 1 column 19 (char 9 - 18)

I thought of splitting the file by tracking the character count. What would be the Pythonic way to handle this?


Solution

  • You could use json.JSONDecoder.raw_decode(), which parses one complete JSON value starting at a given index and returns it together with the index at which it ended. Feeding that index back in lets you iterate through the values one at a time:

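    from json import JSONDecoder, JSONDecodeError
    
    decoder = JSONDecoder()
    data = '[{"a":1}][{"b":2}]'
    
    pos = 0
    while True:
        try:
            # raw_decode returns the parsed value and the index just past it
            o, pos = decoder.raw_decode(data, pos)
            print(o)
        except JSONDecodeError:
            # raised when no value can be parsed at pos, e.g. at end of input
            break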
    

    Result:

    [{'a': 1}]
    [{'b': 2}]
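
The loop above stops on any JSONDecodeError, so malformed input, or whitespace between two values, ends the iteration silently. Below is a minimal sketch of a stricter variant, assuming Python 3.5+ and that the whole file fits in memory. The name iter_json_values is a hypothetical helper for illustration, not part of the json module. It skips JSON whitespace between values, because raw_decode() expects the value to start exactly at the given index, and it lets genuine parse errors propagate:

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in text, in order.

    Hypothetical helper for illustration; not part of the json module.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # raw_decode() does not skip leading whitespace, so do it here.
        while pos < end and text[pos] in " \t\n\r":
            pos += 1
        if pos >= end:
            return
        # Returns the parsed value and the index just past it.
        # A json.JSONDecodeError for malformed input reaches the caller.
        value, pos = decoder.raw_decode(text, pos)
        yield value


for value in iter_json_values('[{"a":1}]\n[{"b":2}]\n'):
    print(value)
```

For the file in the question, the same helper could be called on the file's contents, for example with open("data.json") as f: values = list(iter_json_values(f.read())). Stopping only when the input is exhausted, rather than on the first exception, means a truncated or corrupt second list is reported instead of being dropped.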