I have a file doc.txt whose contents look like "2A4CT2A2C...", and I want to decode it to "AACCCCTAACC..." and then write the result to another file, doc1.txt. I have tried:
(origin and destination are the paths of the docs)
def decode_txt(origin, destination):
    h = open(destination, "w")
    f = open(origin, "r")
    for character in f:
        h.write()
but I couldn't work out how to continue.
You have a pattern of zero or more digits followed by a single character. A regular expression can handle it: (\d*) groups zero or more digits, and it is followed by ([^\d]), a single non-digit character to repeat.
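As a quick check of what that pattern captures, here is a minimal sketch run against the sample string from the question (the variable names `sample` and `pairs` are just for illustration). Each match comes back as a (count, character) tuple, and the count is an empty string when no digits precede the character:

```python
import re

# Sample input taken from the question.
sample = "2A4CT2A2C"

# findall returns one (count, character) tuple per match;
# count is "" when the character has no leading digits.
pairs = re.findall(r"(\d*)([^\d])", sample)
print(pairs)
# [('2', 'A'), ('4', 'C'), ('', 'T'), ('2', 'A'), ('2', 'C')]
```

The empty count for "T" is why the decoding code has to fall back to a repeat count of 1.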
import re

def decode_txt(origin, destination):
    with open(origin) as infile:
        text = infile.read()
    with open(destination, "w") as outfile:
        # Each match is (count, char); an empty count means the char appears once.
        for cnt, char in re.findall(r"(\d*)([^\d])", text):
            outfile.write(char * (int(cnt) if cnt else 1))

# Quick test using the sample from the question.
test = "2A4CT2A2C"
with open("origin", "w") as f:
    f.write(test)
decode_txt("origin", "destination")
with open("destination") as f:
    result = f.read()
print(result)
assert result == "AACCCCTAACC"
Suppose you just wanted string input and output. This could reduce to:
import re

text = "2A4CT2A2C"
out = []
for cnt, char in re.findall(r"(\d*)([^\d])", text):
    # An empty count means the character appears once.
    out.append(char * (int(cnt) if cnt else 1))
out = "".join(out)
If you have a lot of text, the out list will be large. You could use io.StringIO() to create a file-like buffer instead.
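A minimal sketch of the io.StringIO() variant, assuming the same pattern as above. The helper name `decode` is hypothetical and only used for illustration:

```python
import io
import re

def decode(text):
    """Expand run-length encoded text such as "2A4CT" into "AACCCCT"."""
    buf = io.StringIO()  # in-memory, file-like text buffer
    for cnt, char in re.findall(r"(\d*)([^\d])", text):
        # An empty count means the character appears once.
        buf.write(char * (int(cnt) if cnt else 1))
    return buf.getvalue()

print(decode("2A4CT2A2C"))  # AACCCCTAACC
```

Because the buffer has the same write() interface as a real file, the loop body is identical to the file-based version, so switching between the two is just a matter of which object you write to.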