Search code examples
pythonjsonexcelfileexport-to-excel

Convert JSON into XLSX of all .json files in directory


Faced with the problem of reading all json files which are located in the same directory as the executable file of the program, selecting certain elements in these files and saving them in one Excel file.

For example, I have a lot of JSON files like:

First file

{"General":{"Code":"AAIC-PB","Type":"Preferred Share","Name":"Arlington Asset Investment Corp"}, "Highlights":{"MarketCapitalization":211528800,"MarketCapitalizationMln":211.5288}}

Second file

{"General":{"Code":"A","Type":"Common Stock","Name":"Agilent Technologies"}, "Highlights":{"MarketCapitalization":567456,"MarketCapitalizationMln":222.567}}

I wanna take specific key and its' value and put it into excel. For example take CODE from General and MarketCapitalization from Highlights so it must be smth like this in the end:

enter image description here

I am new to python, so I ask for help from those who knows. Thank you!


Solution

  • This should work for you, use glob to find all the json files in a directory, then for each file use json to open those files and combine the "Highlights" and "General" keys and add all the resulting dictionaries to a list, then save to csv

    import csv
    import json
    from glob import glob
    
    # function to open json files
    def read_json(path):
        with open(path, 'r') as file:
            return json.load(file)
    
    # function to save csv files
    def write_csv(data, path):
        with open(path, 'w') as file:
            # get all the keys
            fieldnames = set().union(*data)
            writer = csv.DictWriter(file, fieldnames=fieldnames, lineterminator='\n')
            writer.writeheader()
            writer.writerows(data)
    
    # use glob to find all the *.json files in the folder named json_dir
    json_files = glob('./json_dir/*.json')
    
    rows = []
    
    for json_file in json_files:
        # read the json file
        json_data = read_json(json_file)
        # combine General and Highlights into the same dictionary
        rows.append({**json_data['General'], **json_data['Highlights']})
    
    # write the csv to json_data.csv
    write_csv(rows, './json_data.csv')
    

    Output:

    enter image description here