I have json files in below format, some are empty and some have records:
{"result": []}
or
{"result": [{"approval": "Approved", "opened_at": "02/06/2023 18:49:18", "state": "Work in Progress", "sys_id": "73a6167b1b83a51"}, {"approval": "Approved", "opened_at": "07/11/2023 13:19:38", "state": "Open", "sys_id": "4bfd299371d4b41"}, .......]}
If the json has 'sys_id' column, then it should print sys_id available, else print empty json
I tried this, but each time I only get output as empty list, even if the sys_id is avaiable.
import json
import pandas as pd
from pandas.io.json import json_normalize
spark.conf.set("spark.sql.legacy.json.allowEmptyString.enabled", True)
file "/dbfs/mnt/landing/2023/sample.json"
with open(file) as f:
data = json.load(f)
res = pd.json_normalize(data)
if 'sys_id' in data:
print("sys_id available")
else:
print("empty list")
It seems like the issue in your code is that you're checking if the string 'sys_id' is in the JSON data, which is not what you want. Instead, you should check if the key 'sys_id' is present in the JSON object
import json
import pandas as pd
with open('/dbfs/mnt/landing/2023/sample.json') as f:
data = json.load(f)
# Check if 'result' is a non-empty list
if 'result' in data and isinstance(data['result'], list) and len(data['result']) > 0:
for item in data['result']:
if 'sys_id' in item:
print(f"sys_id available: {item['sys_id']}")
else:
print("empty list")