python, json, dictionary, amazon-s3, boto3

How to save/upload my result as a JSON file into the sub-folders of an S3 bucket via Python


I have a dictionary, My_dict:

{'folder_1/folder_2/folder_3/file_1.csv': 'value_1',
 'folder_1/folder_4/folder_5/file_2.csv': 'value_2'}

I would like to add a JSON file named value.json to each S3 folder that contains file_1.csv or file_2.csv. Each value.json should contain the full path of the CSV file and its corresponding value (i.e. {'folder_1/folder_2/folder_3/file_1.csv': 'value_1'}).

I would like to have a value.json file in the S3 folders:

  1. folder_1/folder_2/folder_3
  2. folder_1/folder_4/folder_5

and each folder should have the 'value.json' as

  1. value.json: {'folder_1/folder_2/folder_3/file_1.csv':'value_1'}
  2. value.json: {'folder_1/folder_4/folder_5/file_2.csv': 'value_2'}

I am new to AWS S3 and I have tried:

    for file, h_val in My_dict.items():
        s3.put_object(
            Bucket=bucket_name, Body=json.dumps(My_dict), Key=f"{file.replace('.csv','')}.json"
        )

but this saves the same file in each folder, containing both results. Instead, I would like each folder to get a JSON file that contains only its own file name and the corresponding value. Any help would be appreciated.


Solution

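  • The attempt in the question serializes the whole of My_dict on every iteration, which is why each uploaded file contains both results. The following snippet uses os.path.dirname() to turn a key like folder_1/folder_2/folder_3/file_1.csv into folder_1/folder_2/folder_3/value.json. For each entry it also builds a temporary dictionary, value_json, that holds only that file's key and value and becomes the content of that particular value.json file.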

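    #!/usr/bin/env python3
    
    import json
    import os.path
    
    import boto3
    
    # Assumptions: default AWS credentials and a placeholder bucket name.
    s3 = boto3.client("s3")
    bucket_name = "my-bucket"
    
    My_dict = {
        'folder_1/folder_2/folder_3/file_1.csv': 'value_1',
        'folder_1/folder_4/folder_5/file_2.csv': 'value_2'
    }
    
    for fname, h_val in My_dict.items():
        # Folder part of the key, e.g. folder_1/folder_2/folder_3
        folder = os.path.dirname(fname)
        print("{} => {}/value.json".format(fname, folder))
        # Only this file's entry goes into its value.json
        value_json = {fname: h_val}
        print("  {}".format(json.dumps(value_json)))
        s3.put_object(
            Bucket=bucket_name,
            Body=json.dumps(value_json),
            Key="{}/value.json".format(folder)
        )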
    

    The print calls are for clarification only, of course. This answer only works if each sub-folder, such as folder_1/folder_2/folder_3, occurs only once in My_dict. If several file_X.csv entries share a folder, each put_object call overwrites the previous value.json, so only the last one written remains.
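
If several CSV files can share a folder, one possible way around the overwrite caveat above is to group the entries by folder first and write a single value.json per folder that holds all of them. This is only a sketch under assumptions: the helper names build_value_files and upload_value_files are hypothetical, and s3_client stands for whatever client object you already have (for example one created with boto3.client("s3")).

```python
import json
import posixpath
from collections import defaultdict


def build_value_files(mapping):
    """Group entries by folder and return {value.json key: JSON body}."""
    by_folder = defaultdict(dict)
    for fname, value in mapping.items():
        # S3 keys always use "/", so posixpath is used regardless of the OS.
        by_folder[posixpath.dirname(fname)][fname] = value
    return {
        # posixpath.join avoids a leading "/" for files at the bucket root.
        posixpath.join(folder, "value.json"): json.dumps(entries)
        for folder, entries in by_folder.items()
    }


def upload_value_files(s3_client, bucket_name, mapping):
    """Upload one value.json per folder (hypothetical helper)."""
    for key, body in build_value_files(mapping).items():
        s3_client.put_object(Bucket=bucket_name, Key=key, Body=body)
```

With the names from the answer, this could be called as upload_value_files(s3, bucket_name, My_dict). Keeping the grouping separate from the upload makes it possible to inspect the resulting keys and bodies before anything is written to the bucket.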