Remove duplicate objects in JSON array with jq based on specific criteria


I have the following data. I want to remove entire objects that share a duplicate value of the "run" key, keeping only the object with the largest "startTime" for each "run" value:

{
  "data": {
    "results": [
      {
        "event": {
          "biking": {
            "startTime": 12,
            "id": "a",
            "run": "x"
          }
        },
        "displayName": "Alex"
      },
      {
        "event": {
          "biking": {
            "startTime": 10,
            "id": "b",
            "run": "x"
          }
        },
        "displayName": "Adam"
      },
      {
        "event": {
          "biking": {
            "startTime": 11,
            "id": "c",
            "run": "y"
          }
        },
        "displayName": "Aaron"
      }
    ]
  }
}

I've been trying to finagle unique with jq but can't quite get what I want. My intended result is this:

{
  "data": {
    "results": [
      {
        "event": {
          "biking": {
            "startTime": 12,
            "id": "a",
            "run": "x"
          }
        },
        "displayName": "Alex"
      },
      {
        "event": {
          "biking": {
            "startTime": 11,
            "id": "c",
            "run": "y"
          }
        },
        "displayName": "Aaron"
      }
    ]
  }
}

I was trying to use unique because I want to keep only one object per "run" value. In a larger list I might have three x, two y, and four z; in that case I'd want to keep one each of x, y, and z, chosen by the largest "startTime".


Solution

  • Here's a straightforward jq solution:

    .data.results |=
      (group_by(.event.biking.run)
       | map(max_by(.event.biking.startTime)))
    

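    It uses group_by to collect the objects that share the same "run" value, and then max_by to pick the object with the largest "startTime" from each group. Note that group_by sorts the groups by the grouping key, so the results come out ordered by "run" rather than in their original order.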
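To try the filter against the sample data from the question, something like the shell sketch below may help. It assumes jq is installed and on the PATH. The variable names (input, by_group, by_unique) are made up for illustration, and the trailing step that extracts only the "id" values is an addition to keep the output short. The second filter is a variant built on unique_by, since the question started from unique. It relies on unique_by keeping the first element of each group after a descending sort, which matches jq's behaviour as far as I know but is worth verifying on your version.

```shell
#!/bin/sh
# Sample input from the question, compacted to one object per line.
input='{"data":{"results":[
  {"event":{"biking":{"startTime":12,"id":"a","run":"x"}},"displayName":"Alex"},
  {"event":{"biking":{"startTime":10,"id":"b","run":"x"}},"displayName":"Adam"},
  {"event":{"biking":{"startTime":11,"id":"c","run":"y"}},"displayName":"Aaron"}
]}}'

# Filter from the answer: group by "run", keep the largest "startTime" per group.
# The last stage only lists the surviving ids (illustrative addition).
by_group=$(printf '%s\n' "$input" | jq -c '
  .data.results |=
    (group_by(.event.biking.run)
     | map(max_by(.event.biking.startTime)))
  | [.data.results[].event.biking.id]')

# Variant (assumption: unique_by keeps the first element of each group):
# sort by descending "startTime", then keep one object per "run".
by_unique=$(printf '%s\n' "$input" | jq -c '
  .data.results |=
    (sort_by(-.event.biking.startTime)
     | unique_by(.event.biking.run))
  | [.data.results[].event.biking.id]')

echo "$by_group"
echo "$by_unique"
```

For the sample data, both lines should print ["a","c"], i.e. Alex's and Aaron's objects survive. Dropping the final `| [.data.results[].event.biking.id]` stage yields the full document in the shape shown in the question.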