I would like to turn the string 10M5D8P into a dictionary:
M:10, D:5, P:8, and so on.
The string could be longer, but it's always a number followed by a single letter from this alphabet: MIDNSHP=X
As a first step I wanted to split the string with a lookbehind and lookahead, in both cases matching this regex: [0-9]+[MIDNSHP=X]
My current, non-working attempt looks like this:
import re
re.compile("(?<=[0-9]+[MIDNSHP=X])(?=[0-9]+[MIDNSHP=X])").split("10M5D8P")
It gives me an error message that I do not understand: "look-behind requires fixed-width pattern"
You may use re.findall.
>>> import re
>>> s = "10M5D8P"
>>> {i[-1]:i[:-1] for i in re.findall(r'[0-9]+[MIDNSHP=X]', s)}
{'M': '10', 'P': '8', 'D': '5'}
>>> {i[-1]:int(i[:-1]) for i in re.findall(r'[0-9]+[MIDNSHP=X]', s)}
{'M': 10, 'P': 8, 'D': 5}
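If slicing each match feels fragile, the same idea can be written with two capture groups, so that re.findall returns (count, letter) tuples directly. This is only a sketch; the function name parse_ops is made up for illustration and is not part of the re module.

```python
import re

def parse_ops(s):
    """Hypothetical helper: map each operation letter to its integer count."""
    # With two groups, findall returns a list of (digits, letter) tuples.
    pairs = re.findall(r'([0-9]+)([MIDNSHP=X])', s)
    return {op: int(count) for count, op in pairs}

result = parse_ops("10M5D8P")
print(result)
```

Note that with any dictionary-based approach, a letter that occurs more than once in the string keeps only its last count.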
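Your regex doesn't work because the re module does not support variable-length lookbehind assertions: the pattern inside (?<=...) must match a fixed number of characters, and [0-9]+ does not. In addition, before Python 3.7, re.split did not split on zero-width matches, so even a fixed-width pattern such as (?<=\d)(?=[A-Z]) could not be used for splitting.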
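For what it's worth, on Python 3.7 or later re.split does accept patterns that can match the empty string, so the split idea from the question can be made to work if the lookbehind is kept to a fixed width. The sketch below assumes Python 3.7+ and looks behind for a single letter only; the variable names are just for illustration.

```python
import re

s = "10M5D8P"

# Fixed-width lookbehind (exactly one letter) plus a lookahead for a digit.
# Requires Python 3.7+, where re.split can split on empty matches.
parts = re.split(r'(?<=[MIDNSHP=X])(?=[0-9])', s)

# Each part is digits followed by one letter, e.g. "10M".
mapping = {p[-1]: int(p[:-1]) for p in parts}
print(parts, mapping)
```

On older versions the re.findall approach remains the simpler choice.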