The title pretty much says it.
I tried a variety of things, including but not limited to:
>>> re.findall(r'(/+)([^/]*)', '///a//b/c///d')
[('///', 'a'), ('//', 'b'), ('/', 'c'), ('///', 'd')]
And:
>>> re.findall('(/+[^/]*)', '///a//b/c///d')
['///a', '//b', '/c', '///d']
What I want is something like:
>>> re.findall(something, '///a//b/c///d')
['/', 'a/b', 'c/', 'd']
...or close to that. Note that this example is of a relative path, because the // at the beginning is a single slash comprising the entire first folder name.
We have something working using string.split('/') and list operations, but we want to explore regex-based solutions.
Thanks!
Assuming that escaping has precedence over splitting (i.e. '///' = '/' + separator), you could do it like this :
p = '///a//b/c///d'
import re # this is not the ideal tool for this kind of thing
# pattern splits '/' when it is preceded by '//' (escaped '/')
# or when it is not preceded by another '/'
# in both cases the '/' must not be followed by another '/'
pattern = r"((?<=\/\/)|(?<!\/))(?!.\/)\/"
# replace the separators by an end of line then split on it
# after unescaping the '//'
path = re.sub(pattern,"\n",p).replace("//","/").split("\n")
# or split and unescape (exclude empty parts generated by re.split)
path = [s.replace("//","/") for s in re.split(pattern,p) if s]
print(path) # ['/', 'a/b', 'c/', 'd']
However a non-re solution will probably be more manageable:
path = [s.replace("\0","/") for s in p.replace("//","\0").split("/")]
# or
path = p.replace("//","\0").replace("/","\n").replace("\0","/").split("\n")
print(path) # ['/', 'a/b', 'c/', 'd']
Note: to obtain ["c//","d"]
you would need the source to be encoded as "c/////d"