I need to replace certain values in an external file ("file.txt") with new values from a Pandas dataframe. The external file contents look like:
(Many lines of comments, then)
identifier1 label2 = i \ label3 label4
label5
A1 = -5563.88 B2 = -4998 C3 = -203.8888 D4 = 5926.8
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12
identifier2 label2 = i \ label3 label4
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
...
From previous posts here, this resource, and Python's "re", I'm trying:
import re

findThisIdentifierInFile = "identifier1"  # I want the data immediately below this identifier in the external file
with open("file.txt", "r") as file:
    file_string = file.read()
i = -500  # New A1 value (i.e., I want to replace the A1 value in the file with -500).
j = 100  # New C3 value.
string1 = re.sub(
    rf"^({findThisIdentifierInFile}\s.*?)A1 = \S+ C3 = \S+",
    f"\g<1>A1 = {i} C3 = {j}",
    file_string,
    flags=re.M | re.S,
)
When I run this, there are no errors, but nothing happens. For example, when I print "string1", the data are identical to those in the original "file.txt". I can't provide more of the code but hope that someone who is experienced with RegEx and re (Python) will be able to spot where I have gone wrong. I apologize in advance because I'm certain to have done something silly.
Sometimes I will also want to replace the B2 value, the E5 - H8 values, and values on the other lines. I'm wondering whether there's a more foolproof/newbie-friendly method I could use to do any possible replacement of values immediately below a particular identifying label.
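IIUC, your pattern never matches because it expects "C3 = ..." to come directly after the A1 value, whereas in the file "B2 = ..." sits between them, so re.sub returns the string unchanged. You can instead do the string replacement in multiple steps, e.g.: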
import re
text = r"""
(Many lines of comments, then)
identifier1 label2 = i \ label3 label4
label5
A1 = -5563.88 B2 = -4998 C3 = -203.8888 D4 = 5926.8
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12

identifier2 label2 = i \ label3 label4
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
..."""
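def my_replace_function(g):
    i = -500  # New A1 value
    j = 100  # New C3 value
    # group(2) is the text between the identifier and the blank line that ends the block
    s = g.group(2)
    s = re.sub(r"A1 = \S+", f"A1 = {i}", s)
    s = re.sub(r"C3 = \S+", f"C3 = {j}", s)
    # Put the identifier and the terminating blank line back around the edited block
    return g.group(1) + s + "\n\n"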
findThisIdentifierInFile = "identifier1"
text = re.sub(
    rf"^({findThisIdentifierInFile})(.*?)\n\n",
    my_replace_function,
    text,
    flags=re.M | re.S,
)
print(text)
Prints:
(Many lines of comments, then)
identifier1 label2 = i \ label3 label4
label5
A1 = -500 B2 = -4998 C3 = 100 D4 = 5926.8
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12

identifier2 label2 = i \ label3 label4
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
...
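To cover the "any possible replacement" part of the question, the same two-step idea can be wrapped in a helper that takes a dict of label/value pairs. This is only a sketch: `replace_values` is a hypothetical name, and it assumes that a block runs from the identifier line to the next blank line (or to the end of the text) and that every value is written as `LABEL = value` with single spaces around the equals sign.

```python
import re


def replace_values(text, identifier, new_values):
    """Replace `LABEL = value` pairs in the block that starts at `identifier`.

    Assumption: a block runs from the identifier line to the next blank
    line (or to the end of the text if there is none).
    """

    def edit_block(match):
        block = match.group(0)
        for label, value in new_values.items():
            # A function replacement avoids backslash/group processing of the
            # new value; count=1 touches only the first occurrence in the block.
            block = re.sub(
                rf"\b{re.escape(label)} = \S+",
                lambda m, label=label, value=value: f"{label} = {value}",
                block,
                count=1,
            )
        return block

    # Match from the identifier up to (not including) the first blank line.
    pattern = rf"^{re.escape(identifier)}\b.*?(?=\n[ \t]*\n|\Z)"
    return re.sub(pattern, edit_block, text, count=1, flags=re.M | re.S)


sample = r"""(Many lines of comments, then)
identifier1 label2 = i \ label3 label4
label5
A1 = -5563.88 B2 = -4998 C3 = -203.8888 D4 = 5926.8
E5 = 24.99876 F6 = 100.6666 G7 = 30.008 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.1
M12

identifier2 label2 = i \ label3 label4
label5
A1 = -788 B2 = -6554 C3 = -100.23 D4 = 7526.8
E5 = 20.99876 F6 = 10.6666 G7 = 20.098 H8 = 10.9999
J9 = 1000000 K10 = 1.0002 L11 = 0.000
M12
"""

updated = replace_values(sample, "identifier1", {"A1": -500, "C3": 100, "H8": 11})
print(updated)
```

Because each label is replaced on its own, the labels do not have to be adjacent or in any particular order, and an identifier that is not found leaves the text unchanged. The dict could be built from a dataframe row (for example with `Series.to_dict()`), and the result written back with `open("file.txt", "w")`, though both steps depend on how your data is laid out.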