I am new to Python, so I would like to get a few ideas for this. I am writing a function that finds matching word patterns in a sentence and replaces the spaces only inside the matched words.
Input:
(c)variable < var_CONST1(例 -125(N)) 【AAA BBB有】AND【技術企画】AND【AAA BBB CCC】
Expected Output:
(c)variable < var_CONST1(例 -125(N)) 【AAA-BBB有】AND【技術企画】AND 【AAA-BBB-CCC】
In the sample, spaces inside "【AAA BBB有】" and "【AAA BBB CCC】" should be replaced with "-".
I wrote the code below, which solves the problem. However, I would like to know if there is a better/more elegant way of writing it.
import re
text = "(c)variable < var_CONST1(例 -125(N)) 【AAA BBB有】AND【技術企画】AND 【AAA BBB CCC】"
match_list = re.findall(r"【[\w\s]+】", text)
match_list = [w.replace(" ", "-") for w in match_list]
tmp_txt = re.sub(r"【[\w\s]+】", " tkn ", text).split()
new_txt = ""
for txt in tmp_txt:
    if txt == "tkn":
        new_txt = new_txt + " " + match_list[0]
        match_list.pop(0)
    else:
        new_txt = new_txt + " " + txt
print(new_txt)
Thank you very much.
We can use re.sub here with a callback function to target only spaces occurring inside 【...】:
import re
inp = "(c)variable < var_CONST1(例 -125(N)) 【AAA BBB有】AND【技術企画】AND【AAA BBB CCC】"
output = re.sub(r'【.*?】', lambda m: m.group().replace(' ', '-'), inp)
print(output)
This prints:
(c)variable < var_CONST1(例 -125(N)) 【AAA-BBB有】AND【技術企画】AND【AAA-BBB-CCC】
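If this is needed in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the names BRACKETED and hyphenate_bracketed and the sep parameter are made up for illustration, and the pattern uses [^】]* in place of .*? as one possible variant, not a required change.

```python
import re

# Matches one 【...】 group. The negated class [^】]* stops at the first
# closing bracket, so no lazy quantifier is needed.
BRACKETED = re.compile(r"【[^】]*】")


def hyphenate_bracketed(text: str, sep: str = "-") -> str:
    """Replace spaces with `sep` only inside 【...】 groups (hypothetical helper)."""
    # The callback receives each match and returns its replacement text.
    return BRACKETED.sub(lambda m: m.group().replace(" ", sep), text)


sample = "(c)variable < var_CONST1(例 -125(N)) 【AAA BBB有】AND【技術企画】AND【AAA BBB CCC】"
result = hyphenate_bracketed(sample)
print(result)
```

On the sample input both patterns give the same result. They differ only when a bracketed group spans a line break: [^】]* still matches it, while .*? does not unless re.DOTALL is passed.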