Search code examples
jsonbashcsvjqratecard-api

Parse JSON to CSV using jq but split sub list in multiple records


I am parsing a JSON result which i get from the Azure RateAPI and want to convert it into a standard CSV file.

The following line is what i am using to convert it into CSV and it works but as one of the attributes is a list, it does not provide me the result i am seeking. For every item in the "sub list", i would need to create another record in my csv file.

cat myfile.json | jq -r '.Meters[] | [ .EffectiveDate, .IncludedQuantity, .MeterCategory, .MeterId, .MeterName, .MeterRates[], .MeterRegion, .MeterStatus, .MeterSubCategory, .MeterTags[], .Units] | @csv'

Here are 3 records I am trying to parse. I am having trouble with record 2 because MeterRates is actually the list where i need both, the attribute and the value. I would need record 2, once parsed, to correspond to 3 records in the CSV file where each record contains one item of the list in the MeterRates. An example of expected result is at the end

  "OfferTerms": [],
  "Meters": [
    {
      "EffectiveDate": "2019-03-01T00:00:00Z",
      "IncludedQuantity": 0,
      "MeterCategory": "Virtual Machines",
      "MeterId": "d0bf9053-17c4-4fec-8502-4eb8376343a7",
      "MeterName": "F2/F2s Low Priority",
      "MeterRates": {
        "0": 0.0766
      },
      "MeterRegion": "US West 2",
      "MeterStatus": "Active",
      "MeterSubCategory": "F/FS Series Windows",
      "MeterTags": [],
      "Unit": "1 Hour"
    },
    {
      "EffectiveDate": "2014-11-01T00:00:00Z",
      "IncludedQuantity": 0,
      "MeterCategory": "Azure DevOps",
      "MeterId": "c4d6fa88-0df9-4680-867a-b13c960a875f",
      "MeterName": "Virtual User Minute",
      "MeterRates": {
        "0": 0.0004,
        "1980000": 0.0002,
        "9980000": 0.0001
      },
      "MeterRegion": "",
      "MeterStatus": "Active",
      "MeterSubCategory": "Cloud-Based Load Testing",
      "MeterTags": [],
      "Unit": "1/Month"
    },
    {
      "EffectiveDate": "2017-04-01T00:00:00Z",
      "IncludedQuantity": 0,
      "MeterCategory": "SQL Database",
      "MeterId": "cb770eab-d5c8-45fd-ac56-8c35069f5a29",
      "MeterName": "P4 DTUs",
      "MeterRates": {
        "0": 68.64
      },
      "MeterRegion": "IN West",
      "MeterStatus": "Active",
      "MeterSubCategory": "Single Premium",
      "MeterTags": [],
      "Unit": "1/Day"
    }
    ]
}

Actual results using the code i provided is the following:

"2019-03-01T00:00:00Z",0,"Virtual Machines","d0bf9053-17c4-4fec-8502-4eb8376343a7","F2/F2s Low Priority",0.0766,"US West 2","Active","F/FS Series Windows",
"2014-11-01T00:00:00Z",0,"Azure DevOps","c4d6fa88-0df9-4680-867a-b13c960a875f","Virtual User Minute",0.0004,0.0002,0.0001,"","Active","Cloud-Based Load Testing",
"2017-04-01T00:00:00Z",0,"SQL Database","cb770eab-d5c8-45fd-ac56-8c35069f5a29","P4 DTUs",68.64,"IN West","Active","Single Premium",

but the result i would expect is (record 2 to correspond to 3 records in the CSV file based on MeterRates):

"2019-03-01T00:00:00Z",0,"Virtual Machines","d0bf9053-17c4-4fec-8502-4eb8376343a7","F2/F2s Low Priority",0,0.0766,"US West 2","Active","F/FS Series Windows",
"2014-11-01T00:00:00Z",0,"Azure DevOps","c4d6fa88-0df9-4680-867a-b13c960a875f","Virtual User Minute",0,0.0004,"","Active","Cloud-Based Load Testing",
"2014-11-01T00:00:00Z",0,"Azure DevOps","c4d6fa88-0df9-4680-867a-b13c960a875f","Virtual User Minute",1980000,0.0002,"","Active","Cloud-Based Load Testing",
"2014-11-01T00:00:00Z",0,"Azure DevOps","c4d6fa88-0df9-4680-867a-b13c960a875f","Virtual User Minute",9980000,0.0001"","Active","Cloud-Based Load Testing",
"2017-04-01T00:00:00Z",0,"SQL Database","cb770eab-d5c8-45fd-ac56-8c35069f5a29","P4 DTUs",0,68.64,"IN West","Active","Single Premium",

Thank you for your help.


Solution

  • You'll want to add a step between getting Meters items and outputting rows, to output the various combinations of Meters items with different rates. As you have it right now, you're outputting the rates as other items for the row which isn't really what you want.

    In this case, you could just add a new property to hold the value of the corresponding MeterRate.

    .Meters[] | .MeterRate = (.MeterRates | to_entries[])
        | [.EffectiveDate, .IncludedQuantity, .MeterCategory, .MeterId , .MeterName,
           .MeterRate.key, .MeterRate.value,
           .MeterRegion, .MeterStatus, .MeterSubCategory, .MeterTags[], .Units]
        | @csv
    

    You may want to consider doing something similar for the MeterTags items so you don't end up with potentially random column counts.