Tags: python, regex, regex-lookarounds

Python: find and replace a string in a file only when it is quoted and not part of a bigger string


I need a solution to the following problem. I want to replace a dynamic string when it is found in a file, but only if it is surrounded by quotes, either directly adjacent or with at most two spaces in between, and it is not part of a bigger string (in Python):

ori = 'testing'
rep = 'posting'

File contents:

Line1 This is one line with some words for testing purposes
Line2 this is the seconds "testing" function.
Line3 that is one more " testing" line
Line4 "testing"
Line5 "  testing"
Line6 "testing  "
Line7 "  testing  "

I'm looking for the following result, preferably with a regex as a simple and efficient approach instead of a separate function.

Line1 This is one line with some words for testing purposes
Line2 this is the seconds "testing" function.
Line3 that is one more " testing" line
Line4 "posting"
Line5 "  posting"
Line6 "posting  "
Line7 "  posting  "

Perhaps the regex magicians can help me with this one.

Thanks in advance.


Solution

  • A regular expression would be a good tool for such a task.
    Always take care to express them clearly.
    Regular expressions can quickly become puzzling and hard to debug.

    import re
    
    original = 'testing'
    replacement = 'posting'
    
    line1 = 'This is one line with some words for testing purposes'
    line2 = 'this is the seconds "testing" function.'
    line3 = 'that is one more " testing" line'
    line4 = '"testing"'
    line5 = '"  testing"'
    line6 = '"testing  "'
    line7 = '"  testing  "'
    
    lines = [line1, line2, line3, line4, line5, line6, line7]
    
    # The line must begin and end with a double quote.
    starts_with_quote = '^"'
    ends_with_quote = '"$'
    # Zero, one, or two spaces are allowed between a quote and the word.
    up_to_two_spaces = ' {0,2}'
    
    # re.escape keeps any regex metacharacters in the dynamic string literal.
    query = starts_with_quote \
            + up_to_two_spaces \
            + re.escape(original) \
            + up_to_two_spaces \
            + ends_with_quote
    
    for line in lines:
        match = re.search(query, line)
        if match:
            line = line.replace(original, replacement)
    
        print(line)
    

    Outputs:

    This is one line with some words for testing purposes
    this is the seconds "testing" function.
    that is one more " testing" line
    "posting"
    "  posting"
    "posting  "
    "  posting  "
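
The question asks about replacing inside a file, so here is a minimal sketch of the same rule applied to a whole text in one pass. It is one possible variation, not the only way to do it. It assumes the quoted string must make up the entire line, as in the expected output above. The names `build_pattern`, `replace_quoted`, and `replace_in_file` are hypothetical helpers introduced for illustration.

```python
import re
from pathlib import Path


def build_pattern(original):
    """Match a line that is exactly: quote, 0-2 spaces, text, 0-2 spaces, quote."""
    return re.compile(
        r'^(" {0,2})' + re.escape(original) + r'( {0,2}")$',
        re.MULTILINE,  # let ^ and $ match at every line boundary
    )


def replace_quoted(text, original, replacement):
    """Replace `original` on every matching line, keeping quotes and spacing."""
    pattern = build_pattern(original)
    # A function replacement inserts `replacement` literally, so backslashes
    # or group references inside it are not interpreted by re.sub.
    return pattern.sub(lambda m: m.group(1) + replacement + m.group(2), text)


def replace_in_file(path, original, replacement, encoding='utf-8'):
    """Rewrite the file at `path` in place (assumes it fits in memory)."""
    file_path = Path(path)
    text = file_path.read_text(encoding=encoding)
    file_path.write_text(replace_quoted(text, original, replacement),
                         encoding=encoding)


sample = '\n'.join([
    'This is one line with some words for testing purposes',
    'this is the seconds "testing" function.',
    'that is one more " testing" line',
    '"testing"',
    '"  testing"',
    '"testing  "',
    '"  testing  "',
])

print(replace_quoted(sample, 'testing', 'posting'))
```

The two capture groups preserve the original quotes and spacing, so only the word itself changes. Because the pattern is anchored with `^` and `$`, a quoted word inside a longer sentence is left alone.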