Python re: why isn't my code for replacing values in a file throwing errors or changing the values?

I need to replace certain values in an external file ("file.txt") with new values from a Pandas dataframe. The external file contents look like:

(Many lines of comments, then)
identifier1       label2 = i \ label3        label4                                  
label5
A1 = -5563.88 B2 = -4998 C3 = -203.8888 D4 = 5926.8 
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12

identifier2       label2 = i \ label3        label4                                  
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8 
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
...

From previous posts here, this resource, and Python's "re", I'm trying:

findThisIdentifierInFile = "identifier1" # I want the data immediately below this identifier in the external file

with open("file.txt", "r") as file:
    file_string = file.read()

    i = -500 # New A1 value (i.e., I want to replace the A1 value in the file with -500).
    j = 100  # New C3 value.  

    string1 = re.sub(
        rf"^({findThisIdentifierInFile}\s.*?)A1 = \S+ C3 = \S+",
        f"\g<1>A1 = {i} C3 = {j}",
        string1,
        flags=re.M | re.S,
    )

When I run this, there are no errors, but nothing happens. For example, when I print "string1", the data are identical to those in the original "file.txt". I can't provide more of the code but hope that someone who is experienced with RegEx and re (Python) will be able to spot where I have gone wrong. I apologize in advance because I'm certain to have done something silly.

Sometimes I will also want to replace the B2 value and the E5 - H8 values and values on the other lines. I'm wondering whether there's a more foolproof/newbie-friendly method I could use to do any possible replacement of values immediately below a particular identifying label.

Solution

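Your pattern requires "C3 = ..." to follow "A1 = ..." directly, but in the data shown "B2 = -4998" sits between them, so the regex never matches and re.sub returns its input unchanged without raising an error. (The snippet also passes "string1" to re.sub although the file was read into "file_string".)

IIUC, you can do the string replacement in multiple steps: first match the whole block below the identifier, then replace each value inside that block, e.g.: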

import re

text = r"""
(Many lines of comments, then)
identifier1       label2 = i \ label3        label4
label5
A1 = -5563.88 B2 = -4998 C3 = -203.8888 D4 = 5926.8
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12

identifier2       label2 = i \ label3        label4
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
..."""


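def my_replace_function(g):
    # Called once per matched block: g.group(1) is the identifier,
    # g.group(2) is everything up to the blank line that ends the block.
    i = -500  # New A1 value
    j = 100  # New C3 value

    s = g.group(2)

    # Replace each value separately, so whatever lies between them is irrelevant.
    s = re.sub(r"A1 = \S+", f"A1 = {i}", s)
    s = re.sub(r"C3 = \S+", f"C3 = {j}", s)

    # Re-append the blank line consumed by the outer pattern.
    return g.group(1) + s + "\n\n"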


findThisIdentifierInFile = "identifier1"
text = re.sub(
    rf"^({findThisIdentifierInFile})(.*?)\n\n",
    my_replace_function,
    text,
    flags=re.M | re.S,
)
print(text)

Prints:


(Many lines of comments, then)
identifier1       label2 = i \ label3        label4
label5
A1 = -500 B2 = -4998 C3 = 100 D4 = 5926.8
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12

identifier2       label2 = i \ label3        label4
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
...
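
For the second part of the question (replacing B2, E5 - H8, or any other value below a given identifier), the same two-step idea can be generalized. The following is only a sketch under some assumptions: the function name replace_values and its new_values dictionary are made up for illustration, every value is assumed to be written as "KEY = value" with single spaces, and a block is assumed to end at the first blank line or at the end of the text. It raises an error when the identifier is not found, so a failed match is no longer silent.

```python
import re


def replace_values(text, identifier, new_values):
    """Replace "KEY = value" pairs in the block that starts with `identifier`.

    `new_values` maps a key such as "A1" to its new value (hypothetical API).
    The block is assumed to end at the first blank line or at end of text.
    """
    # Group 1: the identifier line and everything up to the next blank line.
    # \b stops "identifier1" from also matching "identifier10".
    block_pattern = re.compile(
        rf"^({re.escape(identifier)}\b.*?)(?=\n\s*\n|\Z)",
        flags=re.M | re.S,
    )

    def update_block(match):
        block = match.group(1)
        for key, value in new_values.items():
            # A function replacement avoids backslash handling in the new value.
            block = re.sub(
                rf"\b{re.escape(key)} = \S+",
                lambda m, k=key, v=value: f"{k} = {v}",
                block,
            )
        return block

    # subn also reports how many blocks were rewritten.
    new_text, count = block_pattern.subn(update_block, text, count=1)
    if count == 0:
        raise ValueError(f"identifier {identifier!r} not found")
    return new_text


sample = """identifier1 label4
label5
A1 = -5563.88 B2 = -4998 C3 = -203.8888 D4 = 5926.8
E5 = 24.99876 F6 = 100.6666

identifier2 label4
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8
"""

updated = replace_values(sample, "identifier1", {"A1": -500, "C3": 100, "E5": 25})
print(updated)
```

Only the block under identifier1 is changed; the identifier2 block is left as it was. If the new values live in a Pandas dataframe, one row converted to a dictionary (for example with to_dict() on the row) could be passed as new_values, assuming its index labels match the keys in the file.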