Complete a JSON string from an incomplete HTTP JSON response

Sometimes I download data from a JSON API and the response cuts off midway, usually because of a network timeout or a similar issue. In those cases I would still like to read whatever data did arrive. Here is an example:

    {
        "response": 200,
        "message": null,
        "params": [],
        "body": {
            "timestamp": 1546033192,
            "_d": [
                    {"id": "FMfcgxwBTsWRDsWDqgqRtZlLMdpCpTDz"},
                    {"id": "FMfcgxwBTkFSKqRrcKzMFvLCjDSSbrJH"},
                    {"id": "Fmfgo9
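
For reference, here is a minimal sketch of how the standard `json` module reacts to a payload cut off like this. The `truncated` string is a shortened, hypothetical stand-in for the response above, not the real data.

```python
import json

# Hypothetical, shortened stand-in for a response that was cut off mid-string.
truncated = '{"response": 200, "body": {"_d": [{"id": "a"}, {"id": "Fm'

try:
    json.loads(truncated)
    parsed = True
except json.JSONDecodeError as exc:
    # exc.pos is the character offset at which the parser gave up.
    parsed = False
    print(f"parse failed at character {exc.pos}: {exc.msg}")
```

The parser rejects the whole document, so none of the complete items can be read without repairing the text first.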

I would like to be able to "complete the string" so that I can parse the incomplete response as JSON, dropping the item that was cut off. For example:

    >>> s = '''
    {
        "response": 200,
        "message": null,
        "params": [],
        "body": {
            "timestamp": 1546033192,
            "_d": [
                    {"id": "FMfcgxwBTsWRDsWDqgqRtZlLMdpCpTDz"},
                    {"id": "FMfcgxwBTkFSKqRrcKzMFvLCjDSSbrJH"}
            ]
        }
    }'''
    >>> json.loads(s)
    {'response': 200, 'message': None, 'params': [], 'body': {'timestamp': 1546033192, '_d': [{'id': 'FMfcgxwBTsWRDsWDqgqRtZlLMdpCpTDz'}, {'id': 'FMfcgxwBTkFSKqRrcKzMFvLCjDSSbrJH'}]}}

How can I do this for an arbitrarily structured JSON object that has been truncated like the one above?


  • Here is the way I did it: build a stack of the unclosed { and [ characters, then append the matching } and ] to try to 'finish off' the string. It's a bit verbose and could be cleaned up, but it works on the few string inputs I've tried:

    >>> s = '''
    {
        "response": 200,
        "message": null,
        "params": [],
        "body": {
            "timestamp": 1546033192,
            "_d": [
                    {"id": "FMfcgxwBTsWRDsWDqgqRtZlLMdpCpTDz"},
                    {"id": "FMfcgxwBTkFSKqRrcKzMFvLCjDSSbrJH"},
                    {"id": "Fmfgo9'''
    >>> complete_json_structure(s)
    {'response': 200, 'message': None, 'params': [], 'body': {'timestamp': 1546033192, '_d': [{'id': 'FMfcgxwBTsWRDsWDqgqRtZlLMdpCpTDz'}, {'id': 'FMfcgxwBTkFSKqRrcKzMFvLCjDSSbrJH'}]}}

    Here is the code:

    import json


    class FileParserError(Exception):
        """Raised when no parseable JSON can be recovered."""


    def complete_json_structure(json_string):
        # Build the 'unfinished character' stack
        # (note: brackets inside string values are counted too)
        unfinished = []
        for char in json_string:
            if char in ['{', '[']:
                unfinished.append(char)
            elif char in ['}', ']']:
                inverse_char = '{' if char == '}' else '['
                # Remove the last one
                if unfinished and unfinished[-1] == inverse_char:
                    unfinished.pop()
        # Build the 'closing occurrence string', innermost closer first
        unfinished = ['}' if (char == '{') else ']' for char in reversed(unfinished)]
        unfinished_str = ''.join(unfinished)
        # Do a while loop to try and parse the json
        data = None
        while True:
            if not json_string:
                raise FileParserError("Could not parse the JSON file or infer its format.")
            if json_string[-1] in ('}', ']'):
                try:
                    data = json.loads(json_string + unfinished_str)
                except json.decoder.JSONDecodeError:
                    # Second attempt as a sort of hack: drop the first closer, since its
                    # opening bracket may already have been trimmed off with the partial item
                    try:
                        data = json.loads(json_string + unfinished_str[1:])
                    except json.decoder.JSONDecodeError:
                        pass
                if data is not None:
                    break
            if unfinished_str and json_string[-1] == unfinished_str[0]:
                unfinished_str = unfinished_str[1:]
            # Trim one character, then any trailing whitespace and comma, and retry
            json_string = json_string[:-1].strip().rstrip(',')
        return data
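
The scan above counts every { and [ it sees, including ones that sit inside string values, so a value such as "x]y{" can throw the stack off. As a possible variation (not the code above, just a sketch under stated assumptions), the same stack idea can be made string-aware in a single pass. The names `complete_truncated_json`, `CLOSER` and `_closers` are hypothetical. The sketch assumes the input is a prefix of valid JSON whose top-level value is an object or array, and it cuts at the last comma or closing bracket outside a string, which are points where everything before them is known to be complete.

```python
import json

CLOSER = {"{": "}", "[": "]"}


def _closers(stack):
    """Return the closing brackets for the open containers, innermost first."""
    return "".join(CLOSER[ch] for ch in reversed(stack))


def complete_truncated_json(text):
    """Parse the longest safely closable prefix of a truncated JSON document."""
    stack = []         # currently open '{' / '[' characters
    in_string = False  # are we inside a "..." literal?
    escaped = False    # was the previous character a backslash inside a string?
    best = None        # (cut index, closers needed at that point)

    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in CLOSER:
            stack.append(ch)
        elif ch in ("}", "]"):
            if stack:
                stack.pop()
            # A container just closed: everything up to here is complete.
            best = (i + 1, _closers(stack))
        elif ch == "," and stack:
            # A comma always follows a complete element: cut just before it.
            best = (i, _closers(stack))

    if best is None:
        raise ValueError("no complete element found in the truncated JSON")
    cut, closers = best
    return json.loads(text[:cut] + closers)


print(complete_truncated_json('{"a": "x]y{", "b": [1, 2, {"c": "unfin'))
# {'a': 'x]y{', 'b': [1, 2]}
```

The cut is deliberately conservative: a final value that is not followed by a comma or closing bracket is dropped even if it happens to be complete, because a truncated number such as 2 cannot be told apart from the start of 23.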