Search code examples
jsonjq

JQ - Using a non UTC date field and looking back a certain amount of time


I am trying to write a BASH+jq one liner to only return the entries that are less than 5 minutes old as I am writing automation to only look back 5 minutes (BASH). Using the following JSON:

[
  {
    "id": 621024,
    "iid": 99999,
    "project_id": 999,
    "sha": "faf897sd98fa987afsd98f7",
    "ref": "master",
    "status": "running",
    "source": "push",
    "created_at": "2024-05-10T16:32:01.072-04:00",
    "updated_at": "2024-05-10T16:32:03.565-04:00",
    "web_url": "https://gitlab/pipelines/9999",
    "name": null
  },
  {
    "id": 621023,
    "iid": 99999,
    "project_id": 999,
    "sha": "faf897sd98fa987afsd98f7",
    "ref": "master",
    "status": "success",
    "source": "push",
    "created_at": "2024-05-10T16:31:47.951-04:00",
    "updated_at": "2024-05-10T16:35:30.362-04:00",
    "web_url": "https://gitlab/pipelines/9999",
    "name": null
  },
  {
    "id": 621020,
    "iid": 99999,
    "project_id": 999,
    "sha": "faf897sd98fa987afsd98f7",
    "ref": "master",
    "status": "waiting_for_resource",
    "source": "push",
    "created_at": "2024-05-10T15:27:00.248-04:00",
    "updated_at": "2024-05-10T15:30:14.172-04:00",
    "web_url": "https://gitlab/pipelines/9999",
    "name": null
  },
  {
    "id": 621019,
    "iid": 99999,
    "project_id": 999,
    "sha": "faf897sd98fa987afsd98f7",
    "ref": "master",
    "status": "success",
    "source": "push",
    "created_at": "2024-05-10T14:26:02.235-04:00",
    "updated_at": "2024-05-10T14:29:45.406-04:00",
    "web_url": "https://gitlab/pipelines/9999",
    "name": null
  }
]

My challenge is that the date being returned for created_at is in EST

I have been trying to use some of the traditional time functions/tools with jq but dont think its working because they expect the date to check against to be in UTC

I know I can take off the milliseconds using something like

jq '.[] | .created_at |= .[0:19]'

But that still leaves me with a timestamp that is in EST.

"created_at": "2024-05-10T15:27:00"

All the documentation I read talks about needing the date to be in ISO 8601 to be parsable with some of date/time functions. I dont know what to add to my jq that would subtract only selecting the entries that are less than 5 minutes old

This query fails:

jq '.[] | .created_at |= .[0:19] | select(.created_at | fromdate - 300)'

with error

jq: error (at <stdin>:55): date "2024-05-10T16:32:01" does not match format "%Y-%m-%dT%H:%M:%SZ"


Solution

  • Either use the Go implementation of jq, which can parse %z correctly:

    gojq '
      map(select(
        .created_at | sub("\\.[0-9]+"; "")
        | strptime("%FT%T%z") | now - mktime <= 300
      ))
    '
    

    Or manually interpret the (numeric) timezone (as (+|-)HH[:]MM), and do the math:

    map(select(
      .created_at | sub("\\.[0-9]+"; "")
      | ( capture(".{19}(?<d>[+-])(?<h>[0-9]{2}):?(?<m>[0-9]{2})")
          | .d += "1" | .[] |= tonumber
        ) as $tz
      | strptime("%FT%T%z") | .[3] -= $tz.d * $tz.h | .[4] -= $tz.d * $tz.m
      | now - mktime <= 300
    ))