I have a string like this:
AABBBB$CCCDEEE$AABADEE
And I want a result like this:
2A4B$3CD3E$2ABAD2E
To do that, I wrote a for loop over the string. It works well:
import re
string = "AABBBB$CCCDEEE$AABADEE"
out_string = string[:]
k = 1
c_old = ""
for c in string:
    if c_old == c:
        k += 1
    else:
        if k > 1:
            s = ""
            for i in range(k):
                s += c_old
            chg = str(k) + c_old
            out_string = re.sub(s, chg, out_string, 1)
        k = 1
    c_old = c
print(out_string)
But with very long strings, it can take a long time.
Is there a way to do this without iterating over the whole string, perhaps using the re module?
Not sure why you think re.sub() is appropriate for this. You just need a fairly trivial iteration over the source string.
Something like this:
s = "AABBBB$CCCDEEE$AABADEE"
r = ""
c = 1
p = s[0]
for x in s[1:]:
    if x == p:
        c += 1
    else:
        if c == 1:
            r += p
        else:
            r += f"{c}{p}"
        c = 1
        p = x
else:
    r += p if c == 1 else f"{c}{p}"
print(r)
Output:
2A4B$3CD3E$2ABAD2E
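If you would rather lean on the standard library, here are two more ways to get the same result in a single pass. This is only a sketch: the function names compress_groupby and compress_regex are my own and do not come from the question. The first uses itertools.groupby to split the string into runs. The second is probably closer to what you had in mind with re: one re.sub() call with a backreference pattern and a replacement function, instead of one re.sub() call per run.

```python
import re
from itertools import groupby


def compress_groupby(text: str) -> str:
    """Collapse each run of identical characters into '<count><char>'.

    Runs of length 1 are kept as the bare character.
    (Illustrative helper name, not from the original post.)
    """
    parts = []
    for char, run in groupby(text):
        n = sum(1 for _ in run)  # length of the current run
        parts.append(char if n == 1 else f"{n}{char}")
    return "".join(parts)


# One character followed by one or more copies of itself.
# re.DOTALL lets "." match newlines, so runs of "\n" are handled too.
_RUN = re.compile(r"(.)\1+", re.DOTALL)


def compress_regex(text: str) -> str:
    """Same result using a single re.sub() pass with a callback.

    (Illustrative helper name, not from the original post.)
    """
    return _RUN.sub(lambda m: f"{len(m.group(0))}{m.group(1)}", text)


if __name__ == "__main__":
    sample = "AABBBB$CCCDEEE$AABADEE"
    print(compress_groupby(sample))
    print(compress_regex(sample))
```

Both print 2A4B$3CD3E$2ABAD2E for the sample string. Unlike the loop above, they also accept an empty string, since neither one indexes s[0].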