Merging S3 manifest files using jq


I have multiple S3 manifest files, each corresponding to one date in a given date range. I want to merge all of them into a single manifest file so that I can load everything with a single Redshift COPY.

manifest file 1:

{
    "entries": [
        {
            "url": "DFA/20161001/394007-OMD-Coles/dcm_account394007_activity_20160930_20161001_050403_294198927.csv.gz"
        }
    ]
}

manifest file 2:

{
    "entries": [
        {
            "url": "DFA/20161002/394007-OMD-Coles/dcm_account394007_activity_20161001_20161002_054043_294865863.csv.gz"
        }
    ]
}

I am looking for output like this:

{
    "entries": [
        {
            "url": "DFA/20161001/394007-OMD-Coles/dcm_account394007_activity_20160930_20161001_050403_294198927.csv.gz"
        },
        {
            "url": "DFA/20161002/394007-OMD-Coles/dcm_account394007_activity_20161001_20161002_054043_294865863.csv.gz"
        }
    ]
}

I tried

jq -s '.[]' "manifest_file1.json" "manifest_file2.json"

and other suggestions posted on Stack Overflow, but couldn't get any of them to work.


Solution

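  • The attempt in the question, jq -s '.[]', slurps both files into one array and then emits each element unchanged, so the output is just the two manifests back to back. To merge the entries arrays without resorting to reduce: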

    $ jq -n '{entries: [inputs.entries[]]}' manifest_file_{1,2}.json
    {
      "entries": [
        {
          "url": "DFA/20161001/394007-OMD-Coles/dcm_account394007_activity_20160930_20161001_050403_294198927.csv.gz"
        },
        {
          "url": "DFA/20161002/394007-OMD-Coles/dcm_account394007_activity_20161001_20161002_054043_294865863.csv.gz"
        }
      ]
    }
    

    Note that inputs was introduced in jq version 1.5. If your jq does not have inputs, you can use jq -s as follows:

    $ jq -s '{entries: [.[].entries[]]}' manifest_file_{1,2}.json
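
For a whole date range rather than two files, the filter above can be wrapped in a small shell function. This is only a sketch under stated assumptions: the function names merge_manifests and merge_manifests_reduce, the manifests/ directory, and the merged_manifest.json output name are all hypothetical, and jq 1.5 or later is assumed because of inputs. The reduce form is included for comparison; for well-formed manifests (every file has an entries array) it should give the same result.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of jq or any AWS tooling.
# Each one prints a single merged manifest on stdout, built from the
# "entries" arrays of every manifest file passed as an argument.
# Assumes jq >= 1.5, because both rely on `inputs`.

merge_manifests() {
  # -n stops jq from reading input implicitly; `inputs` then yields the
  # JSON document from each file, in the order the files were given.
  jq -n '{entries: [inputs.entries[]]}' "$@"
}

merge_manifests_reduce() {
  # Same idea with reduce: start from an empty entries array and append
  # the entries of each input manifest in turn.
  jq -n 'reduce inputs as $m ({entries: []}; .entries += $m.entries)' "$@"
}
```

Assuming the per-date manifests live in a manifests/ directory, usage might look like this:

    $ merge_manifests manifests/*.json > merged_manifest.json

The shell expands the glob in sorted order, so if the file names embed the date as YYYYMMDD, the entries should come out in chronological order.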