My JSON file is 5.5 MB (the annotations file of the Object365 dataset for object detection models). My Python program can't even read it as a plain text file.
import json
from pathlib import Path

def ob365_converter(inputJsonFile, datasetPath):
    text = readFileLine(inputJsonFile)
    print("text:")
    print(text[0:100])
    datasetJson = json.loads(text)
    print("dataset loaded")
    for item in datasetJson["annotations"]:
        # Do some operations
        . . .

def readFileLine(filePath):
    p = Path(filePath)
    if not p.is_file():
        # Use % formatting so the path is actually substituted into the message
        print("%s is not a file" % filePath)
        return ""
    with open(filePath, "r") as f:
        # Reads only the first line; a minified JSON file is usually a single line
        text = f.readline()
    return text
The output doesn't even show the first message, "text:". I also tried the following, with the same result:
print("A")
with open(inputJsonFile, "r") as f:
    datasetJson = json.load(f)
print("B")
How to handle huge JSON files in Python?
A JSON file this large requires too many resources to be loaded in one go. As mentioned in @cizario's link, you should use some streaming logic that accesses the JSON objects one at a time without holding the whole content of the file in memory.
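One library that works in a streaming fashion can be found at https://www.npmjs.com/package/stream-json. Note that it is a Node.js (npm) package rather than a Python one; for Python, ijson provides a comparable streaming interface.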
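To make the streaming idea concrete, here is a minimal standard-library sketch that yields the entries of the "annotations" array one at a time instead of loading the whole file. The function name iter_array_items and its parameters are hypothetical, not part of any library. It assumes that the text "annotations" first appears in the file as the wanted key and that its value is a JSON array; a dedicated streaming parser would likely be more robust than this.

```python
import json


def iter_array_items(path, key="annotations", chunk_size=1 << 16):
    """Yield the elements of the JSON array stored under `key`, one by one.

    Hypothetical helper. Assumes the first occurrence of '"<key>"' in the
    file is the wanted key and that its value is a JSON array.
    """
    decoder = json.JSONDecoder()
    marker = '"%s"' % key
    with open(path, "r", encoding="utf-8") as f:
        buf = ""
        # Phase 1: read chunks until the key and its opening '[' are buffered.
        while True:
            pos = buf.find(marker)
            if pos != -1:
                start = buf.find("[", pos + len(marker))
                if start != -1:
                    buf = buf[start + 1:]  # keep only what follows '['
                    break
                buf = buf[pos:]  # key found, still waiting for '['
            else:
                # Keep a short tail in case the key is split across chunks.
                buf = buf[-len(marker):]
            chunk = f.read(chunk_size)
            if not chunk:
                return  # key not present in the file
            buf += chunk

        # Phase 2: decode one array element at a time from the buffer.
        while True:
            buf = buf.lstrip(" \t\r\n,")
            if buf.startswith("]"):
                return  # end of the array
            try:
                item, end = decoder.raw_decode(buf)
                # A value that touches the end of the buffer may be cut off
                # (e.g. a number), so wait for more data before accepting it.
                complete = end < len(buf)
            except json.JSONDecodeError:
                complete = False
            if not complete:
                chunk = f.read(chunk_size)
                if not chunk:
                    raise ValueError("unexpected end of file inside array")
                buf += chunk
                continue
            yield item
            buf = buf[end:]
```

With this sketch, the loop in the question would become "for item in iter_array_items(inputJsonFile):", so only one annotation plus a small read buffer is held in memory at a time.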