I am programming something in Java and I need to "normalize" URIs, meaning a URI should be treated as the same one regardless of the query parameter values for timestamp, portalId, timeout, app version, etc.
Here's my regex pattern: (?<=/)[0-9]+
It works for the following URI: https://app.url.com/user/1234567
However, it doesn't work for the URI below. Is it possible to have one Regex pattern to accommodate both scenarios?
The digits in the example seem to come after the / or the = as well as after version=
What you might do is match 1 or more digits, asserting either a / or = directly to the left, but not, for example, version= directly to the left.
(?<=[/=])(?<!version=)\d+
The pattern matches:
(?<=[/=]) Positive lookbehind, assert either / or = directly to the left
(?<!version=) Negative lookbehind, assert not version= directly to the left
\d+ Match 1+ digits
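Since the question is about Java, here is a minimal sketch of how the pattern could be applied there. The class name UriNormalizer, the method normalize, the placeholder N and the second URI (with a query string) are assumptions made for illustration. They do not come from the question, whose second URI is not shown.

```java
import java.util.regex.Pattern;

class UriNormalizer {
    // Digits directly after '/' or '=', unless the text directly to the left is "version=".
    // Backslashes are doubled because this is a Java string literal.
    private static final Pattern NUMERIC_VALUE =
            Pattern.compile("(?<=[/=])(?<!version=)\\d+");

    // Replaces each matched run of digits with a fixed placeholder.
    // The placeholder "N" is an arbitrary choice for this sketch.
    static String normalize(String uri) {
        return NUMERIC_VALUE.matcher(uri).replaceAll("N");
    }

    public static void main(String[] args) {
        // URI from the question.
        System.out.println(normalize("https://app.url.com/user/1234567"));
        // Hypothetical URI with query parameters, made up for this example.
        System.out.println(normalize(
                "https://app.url.com/user?portalId=42&timestamp=1600000000&version=3"));
    }
}
```

Run as is, this should print https://app.url.com/user/N and https://app.url.com/user?portalId=N&timestamp=N&version=3. Note that the negative lookbehind is case-sensitive and only excludes the literal text version=, so a parameter spelled differently (for example appVersion=) would still have its digits replaced unless the pattern is adjusted.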