Tags: python, json, large-files

Is there a memory-efficient and fast way to load big JSON files?


I have some JSON files of around 500 MB. If I use the "trivial" json.load() to load their content all at once, it consumes a lot of memory.

Is there a way to read the file partially? If it were a plain-text, line-delimited file, I could iterate over the lines. I am looking for an analogous approach.


Solution

  • Update

    See the other answers for advice.

    Original answer from 2010, now outdated

    Short answer: no.

    Properly dividing a JSON file would take intimate knowledge of the JSON object graph to get right.

    However, if you have this knowledge, you could implement a file-like object that wraps the JSON file and spits out proper chunks.

    For instance, if you know that your JSON file is a single array of objects, you could create a generator that wraps the file and returns elements of the array one at a time, as sketched below.

    You would have to do some string parsing to get the chunking of the JSON file right.
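    Here is a minimal sketch of that idea, assuming the file really is a single top-level array of objects. The function name iter_json_array and the buffer size are illustrative; it relies only on the standard library's json.JSONDecoder.raw_decode. Because a truncated object never parses as a complete value, raw_decode failing is a safe signal that more data needs to be read.

```python
import json

def iter_json_array(path, buffer_size=65536):
    """Yield objects one at a time from a file containing a single
    top-level JSON array of objects, without loading it all into memory."""
    decoder = json.JSONDecoder()
    buf = ""
    with open(path, "r", encoding="utf-8") as f:
        # Read until the opening '[' of the array has been seen.
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                return  # empty file
            buf += chunk
            stripped = buf.lstrip()
            if stripped:
                if stripped[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buf = stripped[1:]
                break
        while True:
            # Skip whitespace and the comma separating array elements.
            buf = buf.lstrip().lstrip(",").lstrip()
            if buf.startswith("]"):
                return  # end of the array
            try:
                # raw_decode parses one complete JSON value from the buffer.
                obj, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                # The buffer ends mid-object; pull in more data and retry.
                chunk = f.read(buffer_size)
                if not chunk:
                    raise
                buf += chunk
                continue
            yield obj
            buf = buf[end:]
```

    Records can then be processed one at a time, e.g. `for record in iter_json_array("huge.json"): ...`, so only the current object (plus a small read buffer) is held in memory.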

    I don't know what generates your JSON content. If possible, I would consider generating a number of manageable files instead of one huge file.