I need to rewrite my simple code. I receive simple strings such as "Distrib ABC 1-2-x", "Distrib ABC DEF 1-2-x" and "Distrib ABC DEF GHI 1-2-x".
I need to .split() all the words after "Distrib " and fulfill the following conditions:
If string[0] is text and string[1] is an integer, join only these two to get the result "ABC/1"
If string[0] is text and string[1] is text, join only these two to get the result "ABC/DEF"
If string[0] is text and string[1] is text and string[2] is text, join all three to get the result "ABC/DEF/GHI"
I wrote some simple code to do this, but I'd really like to know how to make it less complex and more readable ;)
import re

def main_execute():
    # Sample inputs; swap the active line to try the other cases.
    input_text = "Distrib ABC 1-2-x"
    #input_text = "Distrib ABC DEF 1-2-x"
    #input_text = "Distrib ABC DEF GHI 1-2-x"
    print(str(input_text))
    # Grab the words after "Distrib" up to the first digit of the trailing code.
    load_data = re.search(r'\s[A-Z]*.[A-Z]*.[A-Z]+ [0-9]', input_text).group()
    print("Extracted string: " + load_data)
    words_array = load_data.split()
    if re.match('[0-9]', words_array[1]):
        print("Composed string: "
              + words_array[0]
              + "/"
              + words_array[1])
    elif re.match('[A-Z]', words_array[0]) and re.match('[A-Z]', words_array[1]) and re.match('[0-9]', words_array[2]):
        print("Composed string: "
              + words_array[0]
              + "/"
              + words_array[1])
    elif re.match('[A-Z]', words_array[0]) and re.match('[A-Z]', words_array[1]) and re.match('[A-Z]', words_array[2]) and re.match('[0-9]', words_array[3]):
        print("Composed string: "
              + words_array[0]
              + "/"
              + words_array[1]
              + "/"
              + words_array[2])

if __name__ == "__main__":
    main_execute()
This can be vastly simplified to
import re
data = """
Distrib ABC 1-2-x
Distrib ABC DEF 1-2-x
Distrib ABC DEF GHI 1-2-x
"""
rx = re.compile(r'Distrib (\w+) (\w+)\s*((?:(?!\d)\w)+)?')
results = ["/".join([n for n in m.groups() if n]) for m in rx.finditer(data)]
print(results)
Which yields
['ABC/1', 'ABC/DEF', 'ABC/DEF/GHI']
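If the input arrives one string at a time, as in the question, the same expression could be wrapped in a small helper. This is only a sketch: the function name join_distrib_words and the choice to return None when nothing matches are assumptions, not part of the original code.

```python
import re

# Same expression as above: two mandatory words, then an optional
# third word that must not contain digits.
RX = re.compile(r'Distrib (\w+) (\w+)\s*((?:(?!\d)\w)+)?')

def join_distrib_words(text):
    """Hypothetical helper: join the words after 'Distrib ' with '/'.

    Returns None when the text does not match (an assumption).
    """
    m = RX.search(text)
    if m is None:
        return None
    # The optional third group is None when absent, so filter it out.
    return "/".join(g for g in m.groups() if g)

for line in ("Distrib ABC 1-2-x",
             "Distrib ABC DEF 1-2-x",
             "Distrib ABC DEF GHI 1-2-x"):
    print(join_distrib_words(line))
```

Using search on a single string avoids building the multi-line data block when there is only one input to process.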
See a demo for the expression on regex101.com.
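The third group can also be written with a negated character class:
Distrib (\w+) (\w+)\s*([^\W\d]+)?
The part [^\W\d]+ says: not non-word characters (the double negation is no mistake!) and not digits, as many as possible. In other words, it matches a run of word characters that are not digits, i.e. letters and underscores.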
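To see what [^\W\d]+ matches in practice, here is a short sketch. The sample string passed to findall is made up for illustration; the data block is the same as in the answer above.

```python
import re

# [^\W\d] excludes non-word characters and digits,
# which leaves letters and the underscore.
letters = re.findall(r'[^\W\d]+', "ABC 1-2-x DEF_9")
print(letters)  # ['ABC', 'x', 'DEF_']

data = """
Distrib ABC 1-2-x
Distrib ABC DEF 1-2-x
Distrib ABC DEF GHI 1-2-x
"""

# Variant of the expression using the negated class for the third group.
rx = re.compile(r'Distrib (\w+) (\w+)\s*([^\W\d]+)?')
results = ["/".join(g for g in m.groups() if g) for m in rx.finditer(data)]
print(results)  # ['ABC/1', 'ABC/DEF', 'ABC/DEF/GHI']
```

Note that the underscore counts as a word character, so a token like DEF_ would also be accepted by this class.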