Remove duplicates in an object in JSONiq


This is an example of the object I have:

{ "query1" : [ { "name" : "John", "id" : 1234 }, { "name" : "Rose", "id" : 3214 }, { "name" : "John", "id" : 1234 } ] }

How can I remove the duplicates using group by and array navigation / unboxing?

I tried adding a group by clause after the where clause, but did not get the correct result.


Solution

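  • In JSONiq, you can indeed remove duplicates with a group by and array unboxing. After the group by, $obj is bound to the sequence of all objects in the current group, so $obj[1] returns one representative per distinct (name, id) pair. This assumes that name and id together identify an object: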

    let $data := {
      "query1" : [
        { "name" : "John", "id" : 1234 },
        { "name" : "Rose", "id" : 3214 },
        { "name" : "John", "id" : 1234 }
      ]
    }
    return {
      "query1" : [
        for $obj in $data.query1[]
        group by $n := $obj.name, $i := $obj.id
        return $obj[1]
      ]
    }
    

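    There is also a more generic approach that works even when the fields are not known in advance or the values are nested. It keeps an object only if it is not deep-equal to any object that appears earlier in the array (note that this compares each object with all earlier ones, so the cost grows quadratically with the array size):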

    let $data := {
      "query1" : [
        { "name" : "John", "id" : 1234 },
        { "name" : "Rose", "id" : 3214 },
        { "name" : "John", "id" : 1234 }
      ]
    }
    return {
      "query1" : [
        for $obj at $i in $data.query1[]
        where
          every $other in $data.query1[][position() lt $i]
          satisfies not deep-equal($obj, $other)
        return $obj
      ]
    }
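
If the same data ends up in Python rather than in a JSONiq engine, a rough equivalent of the generic approach can be sketched as follows. This is only an illustration under stated assumptions: the names dedupe, data and result are hypothetical, and deep equality is approximated by comparing a canonical JSON serialization with sorted keys, which is not guaranteed to match deep-equal in every case (for example, 1 and 1.0 serialize differently).

```python
import json


def dedupe(items):
    """Return items without duplicates, keeping the first occurrence of each.

    Two items count as duplicates when their JSON serializations with
    sorted keys are identical (an approximation of deep equality).
    """
    seen = set()
    unique = []
    for item in items:
        # sort_keys=True makes the key order of (nested) objects irrelevant
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


data = {
    "query1": [
        {"name": "John", "id": 1234},
        {"name": "Rose", "id": 3214},
        {"name": "John", "id": 1234},
    ]
}

# Rebuild the object with the deduplicated array, as in the queries above
result = {"query1": dedupe(data["query1"])}
```

Unlike the pairwise comparison in the query above, this sketch remembers the serialized form of each object in a set, so each object is checked once instead of being compared against every earlier one.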