I'm trying to find the cleanest way to perform run-length encoding based on patterns. The goal is to compress a string by collapsing a run of consecutive identical patterns into a single counted pattern.
Original String:
start{3}{3}{3}{3}end
As you can see, there are 4 "{3}" patterns. It's possible to compress this string by expressing the run of 4 "{3}" patterns as $4{3}.
Compressed String I would like to obtain:
start$4{3}end
I tried the String.replaceAll(regex, replacement) method. I know that myString.replaceAll("\\{([^<])\\}", "$1") can replace a whole pattern with just its value, but I can't figure out how to detect a run of identical patterns and count its length using regular expressions.
Is using a regular expression a good idea, or is there a better way to do this?
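I get the desired output with the code below. There is probably a more efficient approach, but hopefully this helps. Note that it assumes the pattern is known in advance, that all its occurrences form a single consecutive run, and that the string contains no '-' characters.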
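import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "start{3}{3}{3}{3}end";
String pString = "\\{3\\}"; // regex matching the literal token {3}
Pattern p = Pattern.compile(pString);
Matcher m = p.matcher(s);
// Count how many times the token occurs in the string.
int count = 0;
while (m.find()) {
    count++;
}
// Replace each token with "-", then collapse the run of "-" into $count{3}.
System.out.println(s.replaceAll(pString, "-").replaceFirst("-{" + count + "}", "\\$" + count + pString));
// prints: start$4{3}end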
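If the pattern is not known in advance, a regex backreference can detect the run directly. The following is only a sketch, assuming every pattern has the form {...} with no nested braces; the class name PatternRunLength and the method compress are made up for illustration. Group 1 captures one token, \1+ requires at least one immediate repeat of it, and the run length is the match length divided by the token length.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternRunLength {
    // Group 1 is a single token such as {3}; \1+ matches one or more
    // immediate repeats of exactly that token (assumes no nested braces).
    private static final Pattern RUN = Pattern.compile("(\\{[^{}]*\\})\\1+");

    static String compress(String input) {
        Matcher m = RUN.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String token = m.group(1);
            // The whole match is the token repeated, so the lengths divide evenly.
            int count = m.group().length() / token.length();
            // quoteReplacement keeps '$' from being read as a group reference.
            m.appendReplacement(out, Matcher.quoteReplacement("$" + count + token));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("start{3}{3}{3}{3}end")); // prints: start$4{3}end
    }
}
```

Tokens that appear only once are left untouched, and several separate runs in the same string are each compressed independently.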