Split JSON file into multiple files using Python (notebook)


I am using Python (in a notebook). I have a JSON file with the following structure, stored in the variable json_out:

[
{"any": 2023},
{
"dia": 24,
"mes": 1,
"any": 2023,
"mes_referencia": 12,
"any_referencia": 2022,
"calendari_nom": "CCC"
},
{
"dia": 4,
"mes": 12,
"any": 2023,
"mes_referencia": 10,
"any_referencia": 2023,
"calendari_nom": "FFF"
},
{
"dia": 4,
"mes": 1,
"any": 2023,
"mes_referencia": 0,
"any_referencia": 2022,
"calendari_nom": "GAS",
"periode_ref": "TT"
},
{
"dia": 3,
"mes": 10,
"any": 2023,
"mes_referencia": 0,
"any_referencia": 2023,
"calendari_nom": "GAS",
"periode_ref": "22"
}
]

I need to split the file into multiple files, one per { } block. I am a little confused about how to do this. I have tried the following, but it doesn't work for me:

import json

with (json_out, 'r') as f_in:
    data = json.load(f_in)

    for i, fact in enumerate(data[], 1):

        with open('data_out_{}.json'.format(i), 'w') as f_out:
            d = {}
            json.dump(d, f_out, indent=4)

Can you give me any idea of how to do it?

Thanks.


Solution

  • Here is an example of how you can create the files from the provided JSON file:

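    import json
    
    # Load the whole array once; each element is one { } block.
    with open("data.json", "r") as f_in:
        data = json.load(f_in)
    
    # Write every element to its own numbered file, starting at 1.
    for i, d in enumerate(data, 1):
        with open(f"data_out_{i}.json", "w") as f_out:
            json.dump(d, f_out, indent=4)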
    

    For example, data_out_2.json (the second element of the array) will contain:

    {
        "dia": 24,
        "mes": 1,
        "any": 2023,
        "mes_referencia": 12,
        "any_referencia": 2022,
        "calendari_nom": "CCC"
    }
    

    EDIT: If json_output is a string containing JSON-encoded data rather than a file name, parse it with json.loads instead of opening a file:

    import json
    
    json_output = """\
    [
    {"any": 2023},
    {
    "dia": 24,
    "mes": 1,
    "any": 2023,
    "mes_referencia": 12,
    "any_referencia": 2022,
    "calendari_nom": "CCC"
    },
    {
    "dia": 4,
    "mes": 12,
    "any": 2023,
    "mes_referencia": 10,
    "any_referencia": 2023,
    "calendari_nom": "FFF"
    },
    {
    "dia": 4,
    "mes": 1,
    "any": 2023,
    "mes_referencia": 0,
    "any_referencia": 2022,
    "calendari_nom": "GAS",
    "periode_ref": "TT"
    },
    {
    "dia": 3,
    "mes": 10,
    "any": 2023,
    "mes_referencia": 0,
    "any_referencia": 2023,
    "calendari_nom": "GAS",
    "periode_ref": "22"
    }
    ]"""
    
    data = json.loads(json_output)
    
    for i, d in enumerate(data, 1):
        with open(f"data_out_{i}.json", "w") as f_out:
            json.dump(d, f_out, indent=4)
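
In a notebook it is not always obvious whether the variable holds a JSON string or an already-parsed list, so the two cases above could be folded into one helper. This is only a minimal sketch: the function name split_records and its out_dir and prefix parameters are made up for illustration and are not part of the question or of the json module.

```python
import json
from pathlib import Path


def split_records(json_out, out_dir=".", prefix="data_out"):
    """Write each top-level element of a JSON array to its own file.

    `json_out` may be a JSON string or an already-parsed list
    (hypothetical helper; names are assumptions for illustration).
    Returns the list of paths that were written.
    """
    # Parse only if we were handed text; a list is used as-is.
    records = json.loads(json_out) if isinstance(json_out, str) else json_out
    if not isinstance(records, list):
        raise TypeError("expected a JSON array at the top level")

    # Create the output directory if it does not exist yet.
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    written = []
    for i, record in enumerate(records, 1):
        target = out_path / f"{prefix}_{i}.json"
        with open(target, "w", encoding="utf-8") as f_out:
            json.dump(record, f_out, indent=4, ensure_ascii=False)
        written.append(target)
    return written
```

Calling split_records(json_out, out_dir="out") would then write out/data_out_1.json, out/data_out_2.json, and so on, one file per { } block, whichever of the two forms json_out happens to have.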