I have a use case where I have a list of Java maven semantic ranges trying to find the list of versions that I can exclude in my application.
Example:
SemVersion1 = "[,1.8.8.1)"
SemVersion2 = "[,1.8.8.2)"
SemVersion3 = "[,1.8.8.3)"
SemVersion4 = "[1.0.0, 1.4.4)"
SemVersion5 = "[,1.3.11),[,1.7.0), [,1.8.0)"
How to filter the various SemVersion
to only keep the one that includes all other versions. In this case, SemVersion3
will be picked up because it includes all the versions.
Thinking about something like this in python:
output = []
for a in SemVersions:
is_redundent = False
for b in a:
if affected_versions[a].issuperset(affected_versions[b]):
is_redundent = True
break
if not is_redundent:
output.append(b)
The problem is that I would need to inflate the SemVersions
to get the affected_versions
, is there an easier way to do this.
Steps:
',1.2.3.4'
and '1.0,2.3.4' (we need exact lower bounds for the last step)from packaging.version import Version
from functools import cmp_to_key
from collections import defaultdict
def adj_lengths(s, c=3):
"""Add missing .0 to a version up to a predefined count '.'-count.
Example: '9.0.2' with c == 3 -> '9.0.2.0'"""
cnt = s.count(".")
if c>cnt:
s = s + ".0"*(c-cnt)
return s
def split_mult(s):
"""Make versions more precise by replacing implicit range starts.
Uses adj_lengths to add missing .0 to equalize version lengths.
Handles splitting multiple given ranges as well.
Returns iterable.
Example: '[,1.3.11),[,1.7.0), [,1.8.0)'
-> (['0.0.0.0', '1.3.11.0'],['0.0.0.0', '1.7.0.0'], ['0.0.0.0', '1.8.0.0'])"""
s = s.replace("[,",f"[0,")
s1 = [ adj_lengths(b) for b in (t.strip("[()] ") for t in s.split(","))]
yield from [ s1[i:i+2] for i in range(0,len(s1),2)]
def make_version(l):
"""Transform text-list into Version-tuple."""
return ( Version(l[0]), Version(l[1]) )
Program:
vers = ["[,1.8.8.1)",
"[,1.8.8.2)",
"[,1.8.8.3)",
"[1.0.0, 1.4.4)",
"[,1.3.11),[,1.7.0), [,1.8.0)"]
# preprocessing to make them nicer
vers2 = []
for tmp in (split_mult(a) for a in vers) :
vers2.extend( (make_version(k) for k in tmp) )
print(vers2)
# bring into order
vers2.sort()
# use the lower bound as key - append alle upper bounds to a defaultdict(list)
d = defaultdict(list)
for fr,to in vers2:
d[fr].append(to)
# simplify lower bound:[list of upper bounds] to (lower bound, max(upper bound list values))
vers3 = [ (k,max(v)) for k,v in d.items()]
# eliminate range that lie inside the 1st elements range
for item in vers3[1:][:]:
if item[0] >= vers3[0][0] and item[1] <= vers3[0][1]:
vers3.remove(item)
print(vers3)
Output:
[(<Version('0.0.0.0')>, <Version('1.8.8.3')>)]
If you have more then one resulting range you would have to do the last step for every element - not just the first one, f.e. when your data in the last step is like:
[(<Version('0.0.0.0')>, <Version('1.8.8.3')>),
(<Version('1.0.0.0')>, <Version('2.8.8.3')>),
(<Version('1.2.0.0')>, <Version('1.8.8.3')>), ] # last elem inside before-last elem