This is the input JSON. In this example, the key/value pair "foo": "bar" repeats at random positions. Order is not important, even though the pair appears to repeat on alternate elements.
[
{
"foo": "bar",
"id": "baz"
},
{
"thud": "grunt",
"id": "fum"
},
{
"foo": "bar",
"id": "noot"
},
{
"zot": "toto",
"id": "pluto"
},
{
"foo": "bar",
"id": "toto"
}
]
Whenever a key/value pair is repeated, rather than removing it, I want to add an additional key/value pair to that particular element. The desired output would be:
[
{
"foo": "bar",
"id": "baz"
},
{
"thud": "grunt",
"id": "fum"
},
{
"foo": "bar",
"id": "noot",
"desc": "1st duplicate found"
},
{
"zot": "toto",
"id": "pluto"
},
{
"foo": "bar",
"id": "toto",
"desc": "2nd duplicate found"
}
]
Again, the order and numbering are not relevant or required; I added them for illustration only.
I found several solutions for removing duplicates, but I have been unable to make any headway on this problem.
I would appreciate any proposed solution.
Thanks very much for your time.
I tried a complex solution that splits the JSON into two and merges it with -n and --argjson, without much of a breakthrough.
Here's one approach using tostream and fromstream to deconstruct and reconstruct the input via its stream representation, which is a stream of arrays, each containing a path and its corresponding value. A foreach loop iterates over this stream, re-emitting each item for later reconstruction. Additionally, it keeps track of each path-value pair with the path's first item removed (so matches occur regardless of the element's position in the original input array) and records each appearance with a counter. If that counter is higher than one, it also outputs another item (distinguished by appending _dup to the last path item) with the current count as its value.
fromstream(
foreach (tostream | [., (.[0] |= .[1:] | @json)]) as [$s,$j] (
{};
if $s | has(1) then .[$j] += 1 end;
if .[$j] > 1 then [($s[0] | last += "_dup"), .[$j]] else empty end,
$s
)
)
[
{
"foo": "bar",
"id": "baz"
},
{
"thud": "grunt",
"id": "fum"
},
{
"foo_dup": 2,
"foo": "bar",
"id": "noot"
},
{
"zot": "toto",
"id": "pluto"
},
{
"foo_dup": 3,
"foo": "bar",
"id": "toto"
}
]
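For readers who want to check the counting logic outside jq, here is a minimal Python sketch of the same idea. It is not a translation of the stream-based filter: it assumes the input is an array of flat objects (no nested paths), and the function name mark_duplicates is hypothetical. It counts each key/value pair across the array and, from the second appearance on, places a "<key>_dup" entry with the running count just before the original key.

```python
import json
from collections import Counter


def mark_duplicates(items):
    """Return a new list in which every repeated key/value pair gains a
    '<key>_dup' entry holding how often that pair has been seen so far.

    Assumption: each item is a flat dict; nested paths are not handled.
    """
    seen = Counter()
    result = []
    for item in items:
        marked = {}
        for key, value in item.items():
            # Serialize the pair so unhashable values (lists, dicts) can be counted.
            pair = json.dumps([key, value], sort_keys=True)
            seen[pair] += 1
            if seen[pair] > 1:
                # Insert the marker before the original key.
                marked[key + "_dup"] = seen[pair]
            marked[key] = value
        result.append(marked)
    return result


data = [
    {"foo": "bar", "id": "baz"},
    {"thud": "grunt", "id": "fum"},
    {"foo": "bar", "id": "noot"},
    {"zot": "toto", "id": "pluto"},
    {"foo": "bar", "id": "toto"},
]

print(json.dumps(mark_duplicates(data), indent=2))
```

To get the "desc" key from the question instead, the marker line could assign marked["desc"] with a message built from the count; that wording is left open here because the question says the numbering is not required.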