Suppose I have the following string, representing one or many days or day ranges:
mon,thu..fri,sun
How can I match any arbitrary list of ranges or single days with a regular expression, without expanding the day alternatives twice?
I currently have this:
(?P<weekdays>
(
\b
(mon|tue|wed|thu|fri|sat|sun)
(\.\.(mon|tue|wed|thu|fri|sat|sun))?
,?
)*
)
This works, but it forces me to repeat the day alternatives in the regex (they are simplified here; the real ones are longer). Note that this regex also matches fri,sat, with its trailing comma. The optional trailing comma IS the desired behavior.
I also tried making the range portion a limited repetition using {1,2}, but I am unable to avoid matching the invalid mon..tue..fri because the pattern restarts via the optional comma.
Note that this is part of a longer regex so I can't use the global flag.
This is the Regex101 URL, where I also added some unit tests.
Small edit: used the \b metacharacter instead of a negative lookahead.
You can use a PCRE named group and reuse its sub-pattern later with the (?&groupName) construct:
^(?<weekdays>
(
\b
(?<weeks>mon|tue|wed|thu|fri|sat|sun)
(?:\.\.(?&weeks))?
,?
)+
)$
To keep the definition separate from its references, use PCRE's DEFINE directive:
(?(DEFINE)
(?<weeks>mon|tue|wed|thu|fri|sat|sun)
)
^(?P<weekdays>
(?:
\b
(?&weeks)
(?:\.\.(?&weeks))?
,?
)*
)$
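If your engine has no subroutine calls (Python's built-in re module, for instance, supports neither (?&name) nor (?(DEFINE)...)), you can get a similar "write it once" effect by keeping the alternation in a variable and interpolating it into the pattern. This is a different technique from the PCRE constructs above, shown only as a minimal sketch; the names DAYS and WEEKDAYS are made up for illustration, and I used + rather than * so that an empty string is rejected.

```python
import re

# The day alternation is written exactly once and interpolated below.
# DAYS and WEEKDAYS are illustrative names, not part of the original regex.
DAYS = r"(?:mon|tue|wed|thu|fri|sat|sun)"

WEEKDAYS = re.compile(
    rf"""
    ^(?P<weekdays>
        (?:
            \b
            {DAYS}                # a single day ...
            (?:\.\.{DAYS})?       # ... optionally followed by ..day (a range)
            ,?                    # optional separator / trailing comma
        )+
    )$
    """,
    re.VERBOSE,  # lets the pattern span several lines, like the x flag in PCRE
)

for candidate in ("mon,thu..fri,sun", "fri,sat,", "mon..tue..fri"):
    print(candidate, bool(WEEKDAYS.search(candidate)))
```

The compiled pattern still contains the alternation twice, but the source only states it once, which is usually what matters for maintenance.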