
Regex to reduce the dots at the end of a string (however many, or none) to exactly one


I'm trying to make a string end with exactly one ., however many trailing dots it starts with. For example,

line = "python...is...fun..."

I have the regex \.*$ in Ruby, whose match is to be replaced by a single ., as in this demo, but it doesn't seem to work as expected. I've searched for similar posts, and the closest I've found is this answer in Python, which suggests the following:

>>> import re
>>> text1 = 'python...is...fun...'
>>> new_text = re.sub(r"\.+$", ".", text1)
>>> new_text
'python...is...fun.'

But it fails if there is no . at the end. So I tried \b\.*$, as seen here, but this fails on the 3rd test, which has some ?'s at the end.

My question is: why doesn't \.*$ match all the .'s (despite being greedy), and how do I solve the problem correctly?


Expected output:

python...is...fun.
python...is...fun.
python...is...fun??.

Solution

  • You might use an alternation that either matches 2 or more dots, or asserts that what is directly to the left is not a dot (the lookbehind can be extended to exclude other characters, for example ! or ?, if desired).

    In the replacement use a single dot.

    (?:\.{2,}|(?<!\.))$
    

    Explanation

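    • (?: Non-capturing group for the alternation
      • \.{2,} Match 2 or more dots
      • | Or
      • (?<!\.) Assert that what is directly to the left of the current position is not a . (which you can extend with other characters as desired)
    • ) Close the non-capturing group
    • $ End of string. In Python, $ also matches just before a trailing newline, so use \Z if the match must be at the very end. In Ruby, $ matches at the end of every line and \z is the very end of the string.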
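
The difference between $ and \Z only shows up when the string ends with a newline. Here is a minimal Python sketch of that case. The sample string and the variable names are made up for illustration and are not part of the original answer.

```python
import re

PATTERN_DOLLAR = r"(?:\.{2,}|(?<!\.))$"
PATTERN_END = r"(?:\.{2,}|(?<!\.))\Z"

# Hypothetical input with a trailing newline, e.g. a line read from a file.
s = "python...is...fun\n"

# $ matches both just before the trailing newline and at the very end,
# so the empty alternative of the pattern can match at two positions.
with_dollar = re.sub(PATTERN_DOLLAR, ".", s)

# \Z matches only at the very end of the string, i.e. after the newline.
with_end = re.sub(PATTERN_END, ".", s)

print(repr(with_dollar))
print(repr(with_end))
```

If the input may carry a trailing newline, stripping it first (for example with s.rstrip("\n")) is probably simpler than relying on either anchor.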


    For example

    import re 
    strings = [
        "python...is...fun...",
        "python...is...fun",
        "python...is...fun??"
    ]
    
    for s in strings:
        new_text = re.sub(r"(?:\.{2,}|(?<!\.))$", ".", s)
        print(new_text)
    

    Output

    python...is...fun.
    python...is...fun.
    python...is...fun??.
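
As for why \.*$ seems not to match all the dots: it is greedy, but it can also match the empty string. The sketch below, assuming Python 3.7 or later (earlier versions skipped an empty match adjacent to a previous match in re.sub), suggests that the pattern matches twice: once for all the trailing dots, and once more for the empty string at the very end. Each match is replaced by a dot, so two dots come out. Ruby's gsub appears to behave the same way, which would explain the result in the question.

```python
import re

text = "python...is...fun..."

# List every match of \.*$ as (start, end, matched text).
# The second entry is the empty match at the end of the string.
matches = [(m.start(), m.end(), m.group()) for m in re.finditer(r"\.*$", text)]
print(matches)

# Both matches are replaced, so the result ends in two dots (Python 3.7+).
naive = re.sub(r"\.*$", ".", text)
print(naive)
```

The alternation above avoids this because neither branch can match at the end of the string directly after a dot.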
    

    If an empty string should not be replaced by a dot, you can use a positive lookbehind.

    (?:\.{2,}|(?<=[^\s.]))$
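
To see how the two patterns differ, the following sketch runs both on an empty string and on the test strings from the question. The constant names are hypothetical and chosen only for this illustration.

```python
import re

# First pattern: adds a dot whenever the string does not already end in one.
NEGATIVE_LOOKBEHIND = r"(?:\.{2,}|(?<!\.))$"
# Variant: adds a dot only after a character that is neither whitespace nor a dot.
POSITIVE_LOOKBEHIND = r"(?:\.{2,}|(?<=[^\s.]))$"

samples = [
    "",
    "python...is...fun",
    "python...is...fun...",
    "python...is...fun??",
]

for s in samples:
    first = re.sub(NEGATIVE_LOOKBEHIND, ".", s)
    second = re.sub(POSITIVE_LOOKBEHIND, ".", s)
    print(repr(s), "->", repr(first), "|", repr(second))
```

Note that because the character class also excludes whitespace, the variant leaves a string that ends in a space unchanged.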
    
