Sorry for this rather specific use case.
I have a sequence of documents that all have a timestamp field:
2021-12-15T03:06:04Z
2021-12-15T03:06:14Z
2021-12-15T03:06:24Z
2021-12-15T03:06:34Z
2021-12-15T03:06:44Z
2021-12-15T03:07:04Z
2021-12-15T03:17:04Z
My aim is to identify documents that are within 1 minute of each other and delete all but one of them, so that only one document remains per 60-second interval. Which document is kept is not important.
Are there any dateTime functions or functx functions I could leverage to tackle this elegantly? A big caveat is that the timestamps are random; there are no patterns in them that we could build on. There are also around 1k documents, and this needs to run every 3 hours, so performance is also a concern.
Thank you in advance to anyone who replies.
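You can group them by the dateTime formatted to minute precision: use the formatted value as the key of a map and put each date under its key. Each put overwrites the previous value for that key, so at the end there is only one entry per calendar minute. Note that this buckets by calendar minute, so two timestamps less than 60 seconds apart that straddle a minute boundary (such as 03:06:44 and 03:07:04 in your sample) are both kept.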
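let $dates := ("2021-12-15T03:06:04Z",
               "2021-12-15T03:06:14Z",
               "2021-12-15T03:06:24Z",
               "2021-12-15T03:06:34Z",
               "2021-12-15T03:06:44Z",
               "2021-12-15T03:07:04Z",
               "2021-12-15T03:17:04Z") ! xs:dateTime(.)
(: mutable MarkLogic map that will hold one entry per minute-precision key :)
let $dates-by-minute := map:map()
let $_group :=
  for $date in $dates
  (: format the dateTime down to minute precision to use as the grouping key :)
  let $key := fn:format-dateTime($date, "[Y01]/[M01]/[D01] [H01]:[m01]")
  (: a later put for the same key overwrites the earlier value :)
  return map:put($dates-by-minute, $key, $date)
return
  (: the one surviving dateTime for each minute :)
  map:keys($dates-by-minute) ! map:get($dates-by-minute, .)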
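If "within 1 minute of each other" needs to hold across minute boundaries too, one alternative is to sort the timestamps and keep a timestamp only when it is at least 60 seconds after the last one kept. The following is only a rough sketch of that idea, not a tuned solution: local:thin and its parameter names are made up for illustration, the "at least 60 seconds" rule (ge rather than gt) is an assumption about what you want, and the recursion depth grows with the number of timestamps, which I am assuming is acceptable for about 1k documents.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: walks timestamps that are already sorted ascending
   and keeps one only if it is at least $gap after the last kept one. :)
declare function local:thin(
  $sorted as xs:dateTime*,
  $kept as xs:dateTime*,
  $gap as xs:dayTimeDuration
) as xs:dateTime*
{
  if (fn:empty($sorted)) then $kept
  else
    let $next := $sorted[1]
    let $last := $kept[fn:last()]
    let $kept-now :=
      (: the first timestamp is always kept; later ones need a big enough gap :)
      if (fn:empty($last) or $next - $last ge $gap)
      then ($kept, $next)
      else $kept
    return local:thin(fn:subsequence($sorted, 2), $kept-now, $gap)
};

let $dates := ("2021-12-15T03:06:04Z",
               "2021-12-15T03:06:14Z",
               "2021-12-15T03:06:24Z",
               "2021-12-15T03:06:34Z",
               "2021-12-15T03:06:44Z",
               "2021-12-15T03:07:04Z",
               "2021-12-15T03:17:04Z") ! xs:dateTime(.)
(: sort ascending so each timestamp is compared with the last kept one :)
let $sorted := for $d in $dates order by $d return $d
return local:thin($sorted, (), xs:dayTimeDuration("PT60S"))
```

With the sample data this keeps 03:06:04, 03:07:04 and 03:17:04. On real documents you would carry the document URI alongside each timestamp and delete the ones that are not kept.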