
Python requests.get() returns error json.decoder.JSONDecodeError


I am trying to read bulk data from a server over which I have no control.

Error:

json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 1794)


This .json() call throws json.decoder.JSONDecodeError:

import requests
Data = requests.get(Data_Url, headers=session["headers"]).json()
print(Data)
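
The same error can be reproduced without the server. The following is a minimal sketch, assuming the response body holds one JSON object per line; the two sample objects are made up for illustration and are not taken from the real response.

```python
import json

# Two complete JSON objects separated by a newline (made-up sample data).
body = '{"id": "a"}\n{"id": "b"}'

try:
    json.loads(body)
    error = None
except json.JSONDecodeError as exc:
    # json.loads() accepts exactly one JSON document. After parsing the
    # first object it finds more text and reports where that text starts.
    error = exc

print(error)
```

The error reported in the question points at line 2, column 1, which suggests that the first line was parsed as a complete object and the parser stopped at the start of the second one.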

Using .text instead returns the data as a single string, which I can't manipulate:

import requests
Data = requests.get(Data_Url, headers=session["headers"]).text
print(Data)

As shown below, the format of the data seems to be:

{}
{}
{}

How can I manipulate the requests.get() response so that I get valid JSON, with the objects separated by commas, like this?

{ "Data" : [{},{}]}

Sample of the returned data:

{"resourceType":"Person","id":"cg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"3g3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"pg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"GA3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"zQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"qQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"Fw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"Nw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"hw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
{"resourceType":"Person","id":"DSQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}

Solution

  • The results you're getting from .text aren't valid JSON (or a valid Python literal). After studying the results, I determined that each line of the returned string is missing the closing characters "}]}]}" at the end; appending them corrects the problem.

    The code below appends them to each line and then parses the result with the ast.literal_eval() function to turn it into a Python dictionary. A list comprehension collects the dictionaries into a list. In other words, you don't have to nest them inside a dictionary like the {"Data": [{}, {}, ...]} you proposed (unless you really want to for some reason).

    from ast import literal_eval
    from pprint import pprint
    
    requests_get_text = """\
    {"resourceType":"Person","id":"cg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"3g3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"pg3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"GA3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"zQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"qQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"Fw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"Nw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"hw3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    {"resourceType":"Person","id":"DSQ3","extension":[{"extension":[{"valueCoding":{"system":"http://terminology.hl7.org/","code":"UNK","display":"Unknown"}
    """
    
    # Append the missing closing characters to each non-blank line, then
    # parse it into a Python dictionary.
    data = [literal_eval(line.strip() + '}]}]}')
            for line in requests_get_text.splitlines() if line.strip()]
    pprint(data, sort_dicts=False)
    

    Output:

    [{'resourceType': 'Person',
      'id': 'cg3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': '3g3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'pg3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'GA3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'zQ3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'qQ3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'Fw3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'Nw3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'hw3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]},
     {'resourceType': 'Person',
      'id': 'DSQ3',
      'extension': [{'extension': [{'valueCoding': {'system': 'http://terminology.hl7.org/',
                                                    'code': 'UNK',
                                                    'display': 'Unknown'}}]}]}]
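
If the lines shown in the question were only cut short for display and the server actually sends one complete JSON object per line, no closing characters need to be added. In that case each line can be parsed on its own with json.loads(), which also handles JSON's true, false and null (values that ast.literal_eval() rejects). The following is a minimal sketch under that assumption; parse_json_lines and the sample records are hypothetical and stand in for the real requests.get(...).text.

```python
import json


def parse_json_lines(text):
    """Parse text holding one complete JSON object per line into a list."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines, e.g. after a trailing newline
            records.append(json.loads(line))
    return records


# Made-up stand-in for requests.get(...).text; real records are much longer.
sample_text = (
    '{"resourceType": "Person", "id": "cg3", "active": true}\n'
    '{"resourceType": "Person", "id": "3g3", "active": false}\n'
)

records = parse_json_lines(sample_text)
wrapped = {"Data": records}  # only needed if the {"Data": [...]} shape is wanted
print([record["id"] for record in records])
```

If a line really is incomplete, json.loads() raises json.JSONDecodeError for that line, which makes it easy to tell which of the two situations applies.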