I'm trying to write a Python library to parse our version format strings. The (simplified) version string format is as follows:
<product>-<x>.<y>.<z>[-alpha|beta|rc[.<n>][.<extra>]][.centos|redhat|win][.snb|ivb]
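This is:
product name, e.g. foo
version, e.g. 0.1.0
optional pre-release information, e.g. beta, rc.1, alpha.extrainfo
optional OS, e.g. centos
optional platform, e.g. snb, ivb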
So the following are valid version strings:
1) foo-1.2.3
2) foo-2.3.4-alpha
3) foo-3.4.5-rc.2
4) foo-4.5.6-rc.2.extra
5) withos-5.6.7.centos
6) osandextra-7.8.9-rc.extra.redhat
7) all-4.4.4-rc.1.extra.centos.ivb
For all of those examples, the following regex works fine:
^(?P<prod>\w+)-(?P<maj>\d).(?P<min>\d).(?P<bug>\d)(?:-(?P<pre>alpha|beta|rc)(?:\.(?P<pre_n>\d))?(?:\.(?P<pre_x>\w+))?)?(?:\.(?P<os>centos|redhat|win))?(?:\.(?P<plat>snb|ivb))?$
But the problem comes in versions of this type (no 'extra' pre-release information, but with os and/or platform):
8) issue-0.1.0-beta.redhat.snb
With the above regex, for string #8, redhat is being picked up in the pre-release extra info group pre_x, instead of the os group.
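The behaviour described above can be reproduced with a short script. This is only a sketch: the regex is copied verbatim from the question, while the names VERSION_RE, VALID and parse are hypothetical helpers introduced for illustration.

```python
import re

# Regex copied verbatim from the question (the dots between the version
# digits are unescaped there, so they match any character).
VERSION_RE = re.compile(
    r"^(?P<prod>\w+)-(?P<maj>\d).(?P<min>\d).(?P<bug>\d)"
    r"(?:-(?P<pre>alpha|beta|rc)(?:\.(?P<pre_n>\d))?(?:\.(?P<pre_x>\w+))?)?"
    r"(?:\.(?P<os>centos|redhat|win))?"
    r"(?:\.(?P<plat>snb|ivb))?$"
)

# Version strings 1) to 7) from the question.
VALID = [
    "foo-1.2.3",
    "foo-2.3.4-alpha",
    "foo-3.4.5-rc.2",
    "foo-4.5.6-rc.2.extra",
    "withos-5.6.7.centos",
    "osandextra-7.8.9-rc.extra.redhat",
    "all-4.4.4-rc.1.extra.centos.ivb",
]


def parse(version):
    """Return the named groups as a dict, or None if there is no match."""
    match = VERSION_RE.match(version)
    return match.groupdict() if match else None


# String 8) still matches, but "redhat" lands in pre_x and os stays empty.
problem = parse("issue-0.1.0-beta.redhat.snb")
```

Here problem["pre_x"] holds "redhat" and problem["os"] is None, because \w+ in pre_x consumes the first dotted word after the pre-release tag before the os group gets a chance to match.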
I tried using look-behind to avoid picking up the os or platform strings in pre_x:
...(?:\.(?P<pre_x>\w+))?(?<!centos|redhat|win|ivb|snb))...
That is:
^(?P<prod>\w+)-(?P<maj>\d).(?P<min>\d).(?P<bug>\d)(?:-(?P<pre>alpha|beta|rc)(?:\.(?P<pre_n>\d))?(?:\.(?P<pre_x>\w+))?(?<!centos|redhat|win|ivb|snb))?(?:\.(?P<os>centos|redhat|win))?(?:\.(?P<plat>snb|ivb))?$
This would work fine if Python's standard module re could accept variable-width look-behind. I would rather stick to the standard module than use regex, as my library is quite likely to be distributed to a large number of machines, where I want to limit dependencies.
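The limitation is easy to check in isolation. The sketch below assumes nothing beyond the standard re module; the helper name compiles is hypothetical. It shows that the look-behind from the question is rejected because its alternatives have different lengths, whereas a look-behind whose alternatives all have the same length is accepted.

```python
import re


def compiles(pattern):
    """Return True if the standard re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Alternatives of length 6 and 3: re rejects this look-behind.
variable_width_ok = compiles(r"(?<!centos|redhat|win|ivb|snb)")

# All alternatives have length 6: re accepts this one.
fixed_width_ok = compiles(r"(?<!centos|redhat)")

# One fixed-width look-behind per word also compiles.
chained_ok = compiles(r"(?<!centos)(?<!redhat)(?<!win)(?<!ivb)(?<!snb)")
```

Chaining one look-behind per word is shown only to illustrate what re accepts; whether it gives the desired grouping for every version string would still need testing.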
I've also had a look at similar questions: this, this and this are not applicable.
Any ideas on how to achieve this?
My regex101 link: https://regex101.com/r/bH0qI7/3
[For those interested, this is the full regex I'm actually using: https://regex101.com/r/lX7nI6/2]
You need to use a negative lookahead assertion so that the pre-release extra part, (?P<pre_x>\w+) in your regex, matches anything except centos or redhat:
^(?P<prod>\w+)-(?P<maj>\d)\.(?P<min>\d)\.(?P<bug>\d)(?:-(?P<pre>alpha|beta|rc)(?:\.(?P<pre_n>\d))?(?:\.(?:(?!centos|redhat)\w)+)?)?(?:\.(?P<os>centos|redhat))?(?:\.(?P<plat>snb|ivb))?$
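Note that the pattern above leaves the extra part uncaptured and only excludes centos and redhat. One possible adaptation of the same lookahead idea, not the exact pattern above, keeps the pre_x group and also excludes win, snb and ivb. It is a sketch under those assumptions: RESERVED, VERSION_RE and parse are hypothetical names, and the lookahead only rejects a reserved word when it forms the whole dotted component (followed by a dot or the end of the string).

```python
import re

# Words that must never be captured as pre-release extra info.
RESERVED = r"(?:centos|redhat|win|snb|ivb)"

VERSION_RE = re.compile(
    r"^(?P<prod>\w+)-(?P<maj>\d)\.(?P<min>\d)\.(?P<bug>\d)"
    r"(?:-(?P<pre>alpha|beta|rc)"
    r"(?:\.(?P<pre_n>\d))?"
    # Negative lookahead: refuse to start pre_x on a reserved word that is
    # a complete component, so that word is left for the os/plat groups.
    r"(?:\.(?P<pre_x>(?!" + RESERVED + r"(?:\.|$))\w+))?"
    r")?"
    r"(?:\.(?P<os>centos|redhat|win))?"
    r"(?:\.(?P<plat>snb|ivb))?$"
)


def parse(version):
    """Return the named groups as a dict, or None if there is no match."""
    match = VERSION_RE.match(version)
    return match.groupdict() if match else None


fixed = parse("issue-0.1.0-beta.redhat.snb")
with_extra = parse("osandextra-7.8.9-rc.extra.redhat")
```

With this variant, fixed has os set to "redhat" and plat set to "snb" with pre_x empty, while with_extra still reports "extra" in pre_x. An extra word that merely starts with a reserved word, such as winter, is still accepted as pre_x.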