python regex python-3.x string string-matching

How to separate the prefix 'di' from words?


I want to separate the prefix "di" from the words it is attached to, that is, wherever "di" is immediately followed by letters. Where "di" is already a separate word, the text should stay unchanged.

sentence1 = "dipermudah diperlancar"
sentence2 = "di permudah di perlancar"

I expect output like this:

output1 = "di permudah di perlancar"
output2 = "di permudah di perlancar"



Solution

  • This expression might work to some extent:

    (di)(\S+)
    

    if the data is as simple as in the question. Otherwise, the expression would need more boundaries, for example a word boundary before "di" so that a "di" in the middle of a word is not split.

    Test

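    import re

    # Group 1 captures "di"; group 2 captures the non-whitespace run after it.
    regex = r"(di)(\S+)"
    test_str = "dipermudah diperlancar"
    # Re-emit both groups with a space between them.
    subst = r"\1 \2"

    print(re.sub(regex, subst, test_str))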
    

    The expression is explained in the top-right panel of regex101.com, if you wish to explore, simplify, or modify it; there you can also watch how it matches against some sample inputs.
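
As a rough sketch of what "more boundaries" could look like, the variant below anchors "di" to the start of a word with `\b` and uses a lookahead to require a following word character. The helper name `split_di_prefix` is hypothetical and not part of the original answer. It assumes that every word beginning with "di" should be split, so a word that merely starts with those letters (for example "dia") would be split as well.

```python
import re

# Hypothetical helper for illustration; not from the original answer.
def split_di_prefix(text: str) -> str:
    # \b anchors "di" to the start of a word, so a "di" inside a word
    # is ignored. The lookahead (?=\w) requires a word character right
    # after it, so a standalone "di" is left untouched.
    return re.sub(r"\bdi(?=\w)", "di ", text)

print(split_di_prefix("dipermudah diperlancar"))    # attached prefixes are split
print(split_di_prefix("di permudah di perlancar"))  # already separated, unchanged
print(split_di_prefix("mendidik"))                  # "di" mid-word, unchanged
```

Distinguishing a true prefix from a word that just happens to start with "di" is beyond what a pattern like this can do; that would likely need a word list or a morphological analyzer.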