Use the output of one jq command as arguments to another jq command to compare two JSON files


I want to filter one JSON file based on the contents of another JSON file.

I extract the customer IDs from the first JSON file, data.json, with this jq command:

jq '[.data[] | select(.attributes.closeReason == "Fraud")][:10] | [.[].relationships.customer.id]' data.json

Which yields something like:

[
  "621506",
  "624722",
  "631044",
  "633359",
  "699327",
  "710710",
  "711493",
  "713413",
  "713824",
  "713903"
]

I then want to use those values to match the IDs in the other JSON file.

Something like:

jq --argjson ids "$ARGS.positional[]" '.data[] | select(.id | IN($ids[]))' accounts.json

But I get the error:

jq: invalid JSON text passed to --argjson

I've searched Stack Overflow, Phind.com, and ChatGPT, but I haven't found a solution yet.

Examples - JSON files:

1st JSON file (the customer ID is extracted from here):

[
  {
    "type": "depositAccount",
    "id": "991231",
    "attributes": {
      "name": "William Heckerd",
      "status": "Closed",
      "closeReason": "Fraud",
      "fraudReason": "ACHActivity",
      "updatedAt": "2023-02-24T19:41:30.224Z"
    },
    "relationships": {
      "customer": {
        "data": {
          "type": "customer",
          "id": "123456"
        }
      }
    }
  }
]

2nd JSON file (to be filtered by the extracted customer IDs):

{
  "data": [
    {
      "type": "individualCustomer",
      "id": "567898765",
      "attributes": {
        "createdAt": "2023-09-10T08:32:07.921Z",
        "fullName": {
          "first": "Foo",
          "last": "Example"
        },
        "email": "[email protected]",
        "status": "Archived",
        "archiveReason": "FraudClientIdentified"
      }
    }
  ]
}



Solution

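  • jq can already handle multiple input files by itself (in the commands here, accounts.json is the array of accounts from the first example and data.json is the customer file from the second):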

    jq '
    map(select(.attributes.closeReason == "Fraud") | .relationships.customer.data.id) as $ids
    | input
    | .data[]
    | select(.id | IN($ids[]))
    ' accounts.json data.json
    

    Alternatively, if you really want to invoke two instances of jq:

    jq \
    --argjson ids "$(jq 'map(select(.attributes.closeReason == "Fraud") | .relationships.customer.data.id)' accounts.json)" \
    '.data[] | select(.id | IN($ids[]))' \
    data.json
    

    or with an intermediate variable:

    ids="$(jq 'map(select(.attributes.closeReason == "Fraud") | .relationships.customer.data.id)' accounts.json)"
    jq \
    --argjson ids "$ids" \
    '.data[] | select(.id | IN($ids[]))' \
    data.json
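
To try the approaches above end to end, here is a minimal sketch. The sample records, their IDs, and the temporary directory are made up for illustration (trimmed versions of the examples in the question, with one customer ID chosen to match), and it assumes a jq version that provides `IN`.

```shell
#!/bin/sh
# Hypothetical sample data: file names follow the answer above, the IDs are
# invented so that exactly one customer matches.
tmp="$(mktemp -d)"

cat > "$tmp/accounts.json" <<'EOF'
[
  {"id": "991231",
   "attributes": {"closeReason": "Fraud"},
   "relationships": {"customer": {"data": {"type": "customer", "id": "123456"}}}},
  {"id": "991232",
   "attributes": {"closeReason": "ByCustomer"},
   "relationships": {"customer": {"data": {"type": "customer", "id": "654321"}}}}
]
EOF

cat > "$tmp/data.json" <<'EOF'
{"data": [
  {"type": "individualCustomer", "id": "123456"},
  {"type": "individualCustomer", "id": "654321"}
]}
EOF

# Step 1: collect the customer IDs of fraud-closed accounts as a compact JSON array.
ids="$(jq -c 'map(select(.attributes.closeReason == "Fraud") | .relationships.customer.data.id)' "$tmp/accounts.json")"

# Step 2: hand that array to a second jq as $ids and keep only matching customers.
matched="$(jq -r --argjson ids "$ids" '.data[] | select(.id | IN($ids[])) | .id' "$tmp/data.json")"

# Same result from a single jq process: the program runs on the first file,
# and `input` reads the next one.
single="$(jq -r '
  map(select(.attributes.closeReason == "Fraud") | .relationships.customer.data.id) as $ids
  | input
  | .data[]
  | select(.id | IN($ids[]))
  | .id
' "$tmp/accounts.json" "$tmp/data.json")"

echo "$ids"      # the JSON array passed to --argjson
echo "$matched"  # customer IDs found by the two-process variant
echo "$single"   # customer IDs found by the single-process variant

rm -rf "$tmp"
```

The key point is that `--argjson` needs its value to be valid JSON text at the moment the shell starts jq. In the original attempt, `"$ARGS.positional[]"` sits inside shell double quotes, so the shell most likely expands `$ARGS` as an (unset) shell variable and jq receives `.positional[]`, which is not JSON; `$ARGS` only has meaning inside the jq program itself.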