I have 500x500 bitmaps containing no more than 16 colors that I need to convert to a text file where each color is represented by a character.
I then need to reduce the size of the text file by finding patterns in each line.
I have the characters right now in a 2D array.
For example:
AHAHAH = 3(AH)
HAHAHA = 3(HA)
AAAHHH = 3(A)3(H)
ABYZTT = ABYZ2(T)
AHAHAB = 2(AH)AB
I don't think I can use regular expressions because there are so many possible combinations.
I am not even sure where to begin.
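For what it's worth, a regular expression does not have to enumerate the combinations: a backreference can match "some group followed by one or more copies of itself". Below is a minimal sketch, assuming Python's `re` module and the count(group) notation from the examples above. The names `compress_line` and `shorten` are made up for illustration.

```python
import re

# Lazy group (.+?) followed by one or more copies of itself (\1+).
_REPEAT = re.compile(r"(.+?)\1+")

def compress_line(line):
    """Rewrite each run of a repeated group as count(group)."""
    def shorten(match):
        group = match.group(1)
        # The whole match is the group repeated, so its length gives the count.
        count = len(match.group(0)) // len(group)
        return f"{count}({group})"
    return _REPEAT.sub(shorten, line)

for line in ["AHAHAH", "HAHAHA", "AAAHHH", "ABYZTT", "AHAHAB"]:
    print(line, "=", compress_line(line))
```

This takes the shortest repeating group at the leftmost position, so it is not guaranteed to give the shortest possible result: AABAAB becomes 2(A)B2(A)B rather than 2(AAB). It also assumes the color characters never include digits or parentheses, since those would make the output ambiguous.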
Here is what I did to solve my problem. I haven't thoroughly checked edge cases, but it works on my test inputs, so maybe it will help someone in the future. It is run-length encoding, but for groups of characters rather than individual characters. From what I read, normal RLE would encode AAAAHAHA as A4H1A1H1A1, whereas I needed 4A2HA (the code below writes this as (4)A(2)HA).
string = 'AHYAHYAHAHAHAHAHAHAHBBBBBBBTATAZAB*+I'
new_string = ""
i = 1  # length of the group currently being tested
while string:
    sub_string1 = string[:i]
    sub_string2 = string[i:i+i]
    if sub_string1 == sub_string2:
        # The group repeats at least once; count further consecutive copies.
        count = 1
        while string[count*i:(count+1)*i] == string[(count+1)*i:(count+2)*i]:
            count += 1
        new_string += "(" + str(count+1) + ")" + sub_string1
        # Drop the encoded run and start over with the shortest group.
        string = string[count*i+i:]
        i = 1
    elif i >= len(string) // 2:
        # A longer group would not fit twice in what is left,
        # so copy one character through unchanged and move on.
        new_string += string[0]
        string = string[1:]
        i = 1
    else:
        i += 1
print(new_string)
(2)AHY(7)AH(7)B(2)TAZAB*+I
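For comparison, the per-character RLE mentioned above can be sketched with `itertools.groupby`. This is only an illustration of the contrast, not part of the solution, and `classic_rle` is a made-up name.

```python
from itertools import groupby

def classic_rle(line):
    """Per-character RLE: each run becomes the character plus its length."""
    # groupby yields one (character, run) pair per stretch of equal characters.
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(line))

print(classic_rle("AAAAHAHA"))  # A4H1A1H1A1
```

Because every run gets a count, even a run of one, alternating patterns like HAHA grow instead of shrinking, which is why encoding whole groups suits this data better.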