Tags: json, python-3.x, list, dictionary, merge

Merge multiple JSON files (more than two)


I would like to merge multiple JSON files into one file. All of the files have the same structure. As an example, I've created three files that look like this:

ExampleFile_1

    {
      "items": [
        {
          "answers": [
            {
              "creation_date": 1538172165
            },
            {
              "creation_date": 1538172205
            },
            {
              "creation_date": 1538172245
            }
          ],
          "creation_date": 1538172012,
          "question_id": 52563137
        }
      ]
    }

ExampleFile_2

    {
      "items": [
        {
          "answers": [
            {
              "creation_date": 1538326991
            }
          ],
          "creation_date": 1538172095,
          "question_id": 52563147
        },
        {
          "answers": [
            {
              "creation_date": 1538180453
            }
          ],
          "creation_date": 1538172112,
          "question_id": 52563150
        }
      ]
    }

ExampleFile_3

    {
      "items": [
        {
          "answers": [
            {
              "creation_date": 1538326991
            }
          ],
          "creation_date": 1538172095,
          "question_id": 52563147
        }
      ]
    }

Now I would like to merge the "items" lists of all three files into one file, which would then look like this:

merged_json.json

    {
      "items": [
        {
          "answers": [
            {
              "creation_date": 1538172165
            },
            {
              "creation_date": 1538172205
            },
            {
              "creation_date": 1538172245
            }
          ],
          "creation_date": 1538172012,
          "question_id": 52563137
        },
        {
          "answers": [
            {
              "creation_date": 1538326991
            }
          ],
          "creation_date": 1538172095,
          "question_id": 52563147
        },
        {
          "answers": [
            {
              "creation_date": 1538180453
            }
          ],
          "creation_date": 1538172112,
          "question_id": 52563150
        },
        {
          "answers": [
            {
              "creation_date": 1538326991
            }
          ],
          "creation_date": 1538172095,
          "question_id": 52563147
        }
      ]
    }

So, as shown above, the "items" lists should be concatenated.

I already tried to come up with a solution but could not figure it out. This is what I got so far:

    import glob
    import json

    read_files = glob.glob("ExampleFile*.json")
    output_list = []

    for f in read_files:
        with open(f, "rb") as infile:
            output_list.append(json.load(infile))

    all_items = []
    for json_file in output_list:
        all_items += json_file['items']

    textfile_merged = open('merged_json.json', 'w')
    textfile_merged.write(str(all_items))
    textfile_merged.close()

This, unfortunately, leaves me with a messed-up JSON file that consists only of the dicts inside "items".

How do I create a file like merged_json.json?

Thanks in advance.


Solution

  • You're using the json module to convert the JSON files into Python objects, but you're not using it to convert those Python objects back into JSON. Instead of this at the end:

    textfile_merged.write(str(all_items))
    

    try this:

    json.dump({ "items": all_items }, textfile_merged)
    

    (Note that this also wraps the all_items list in a dictionary so that you get the output you expect; otherwise the output would be a JSON array, not an object with an "items" key.)
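
Putting the fix together, the whole merge might look like the sketch below. It is only one way to arrange it: the function name `merge_json_files`, the `indent=2` pretty-printing, and the sorting of the file names are assumptions added for illustration, not part of the original question or answer. Sorting is included because `glob.glob` does not guarantee any particular order, and the expected output lists the items in file order.

```python
import glob
import json


def merge_json_files(pattern, output_path):
    """Concatenate the "items" lists of every file matching pattern.

    Hypothetical helper: writes {"items": [...]} to output_path and
    returns the number of merged items.
    """
    all_items = []
    # glob.glob returns paths in arbitrary order, so sort for a stable result.
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", encoding="utf-8") as infile:
            all_items.extend(json.load(infile)["items"])

    # json.dump writes valid JSON (double-quoted strings, true/false/null);
    # str() would write a Python repr instead.
    with open(output_path, "w", encoding="utf-8") as outfile:
        json.dump({"items": all_items}, outfile, indent=2)
    return len(all_items)
```

With the three example files in the current directory, calling `merge_json_files("ExampleFile*.json", "merged_json.json")` should produce a file shaped like merged_json.json above. Note that the output name does not match the `ExampleFile*.json` pattern, so rerunning the merge does not pick up its own output.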