I want to split off the prefix "di" when it is attached directly to the letters that follow it, so that it becomes a separate word.
sentence1 = "dipermudah diperlancar"
sentence2 = "di permudah di perlancar"
I expect output like this:
output1 = "di permudah di perlancar"
output2 = "di permudah di perlancar"
This expression might work to some extent:
(di)(\S+)
if the data is as simple as in the question. Otherwise, we would need to add more boundaries to the expression.
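import re

# Capture "di" and the run of non-whitespace characters attached to it.
regex = r"(di)(\S+)"
test_str = "dipermudah diperlancar"
# Re-emit both groups with a single space between them.
subst = r"\1 \2"
print(re.sub(regex, subst, test_str))  # di permudah di perlancar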
The expression is explained in the top-right panel of regex101.com, if you wish to explore, simplify, or modify it, and in this link you can watch how it matches against some sample inputs.
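As one possible way to add the boundaries mentioned above, here is a minimal sketch that anchors "di" at the start of a word with \b and uses \w+ for the rest. The helper name split_di_prefix and the third sample input are assumptions for illustration, not part of the question.

```python
import re

# Hypothetical helper, not from the original answer.
def split_di_prefix(text):
    # \b requires "di" to start a word, so a "di" in the middle of a
    # word is ignored; \w+ requires at least one word character after
    # it, so a standalone "di" is left unchanged.
    return re.sub(r"\b(di)(\w+)", r"\1 \2", text)

print(split_di_prefix("dipermudah diperlancar"))
print(split_di_prefix("di permudah di perlancar"))
print(split_di_prefix("mendidih dipermudah"))
```

Note that this still splits any word that merely begins with "di" (for example "dia"), so real data would probably need an additional word list or exclusion rule.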