python regex python-re

Can't get regex patterns right


I wrote a function that replaces every occurrence of a single character with one of several replacements, depending on where the character appears in a word.

There were two ways I found to accomplish this:

  1. This one looks horrible but it works:

def xSubstitution(target_string):

    while target_string.casefold().find('x') != -1:

        x_finded = target_string.casefold().find('x')

        if (x_finded == 0 and target_string[1] == ' ') or (target_string[x_finded-1] == ' ' and
           ((target_string[-1] == 'x' or 'X') or target_string[x_finded+1] == ' ')):

            target_string = target_string.replace(target_string[x_finded], 'ecks', 1)

        elif (target_string[x_finded+1] != ' '):

            target_string = target_string.replace(target_string[x_finded], 'z', 1)
        else:

            target_string = target_string.replace(target_string[x_finded], 'cks', 1)

    return(target_string)

  2. This one technically works, but I just can't get the regex patterns right:

import re

def multipleRegexSubstitutions(sentence):

    patterns = {(r'^[xX]\s'): 'ecks ', (r'[^\w]\s?[xX](?!\w)'): 'ecks',
                (r'[\w][xX]'): 'cks', (r'[\w][xX][\w]'): 'cks',
                (r'^[xX][\w]'): 'z', (r'\s[xX][\w]'): 'z'}

    regexes = [
        re.compile(p)
        for p in patterns
    ]

    for regex in regexes:
        for match in re.finditer(regex, sentence):
            match_location = sentence.casefold().find('x', match.start(), match.end())
            sentence = sentence.replace(sentence[match_location], patterns.get(regex.pattern), 1)
    return sentence

As far as I can tell, the only problem with the second function is the regex patterns. Could someone help me?

EDIT: Sorry, I forgot to mention that the regexes look for the different 'x' characters in a string: an 'x' at the beginning of a word is replaced with 'Z', an 'x' in the middle or at the end of a word with 'cks', and a lone 'x' with 'ecks'.


Solution

  • You need \b (a word boundary) and \B (any position that is not a word boundary):

    Replace an X at the beginning of a word with 'Z':

    re.sub(r'\bX\B', 'Z', s, flags=re.I)
    

    Replace an X in the middle or at the end of a word with 'cks':

    re.sub(r'\BX', 'cks', s, flags=re.I)
    

    Replace a lone 'x' with 'ecks':

    re.sub(r'\bX\b', 'ecks', s, flags=re.I)
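
Putting the three substitutions together, a minimal sketch could look like the following. The function names `x_substitution` and `x_substitution_single_pass` are made up for illustration; only the three patterns come from the answer above. The order of the three calls should not matter here, because each replacement swaps one word character for other word characters, so no word boundary moves and no new 'x' is introduced. The second function is an optional single-pass variant that, as an assumption about how you might structure it, combines the patterns into named groups and picks the replacement from the group that matched.

```python
import re


def x_substitution(s):
    # Hypothetical wrapper name; applies the three patterns one after another.
    # Word-initial x followed by another word character -> 'Z'
    s = re.sub(r'\bX\B', 'Z', s, flags=re.I)
    # x preceded by a word character (middle or end of a word) -> 'cks'
    s = re.sub(r'\BX', 'cks', s, flags=re.I)
    # x standing alone as a whole word -> 'ecks'
    s = re.sub(r'\bX\b', 'ecks', s, flags=re.I)
    return s


# Single-pass variant (an illustrative alternative, not part of the answer):
# each alternative is a named group, and Match.lastgroup reports which matched.
_X_RULES = re.compile(r'(?P<start>\bx\B)|(?P<inner>\Bx)|(?P<lone>\bx\b)', re.I)
_REPLACEMENTS = {'start': 'Z', 'inner': 'cks', 'lone': 'ecks'}


def x_substitution_single_pass(s):
    return _X_RULES.sub(lambda m: _REPLACEMENTS[m.lastgroup], s)


print(x_substitution("x marks the box near Xavier"))
# ecks marks the bocks near Zavier
```

Both functions should give the same result for the same input; the single-pass version simply avoids scanning the string three times.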