I have a regular expression that traverses a string and pulls out 40 values. It looks sort of like the pattern below, but much larger and more complicated:
est(.*)/test>test>(.*)<test><test>(.*)test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test>
My question is: how do I use these groups with the replace command when the group number exceeds 9? It seems that whenever I use \10, it returns the value for \1 and then appends a 0 to the end.
Any help would be much appreciated thanks :)
Also I am using UEStudio, but if a different program does it better then no biggie :)
Most of the simple Regex engines used by editors aren't equipped to handle more than 10 matching groups; it doesn't seem like UltraEdit can. I just tried Notepad++ and it won't even match a regex with 10 groups.
Your best bet, I think, is to write something quick in a scripting language with a decent regex engine, though that wouldn't answer the question as asked.
Here's something in Python:
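    import re

    # 15 single-character capture groups, standing in for the real pattern
    pattern = re.compile(r'(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)')

    with open('input.txt', 'r') as f:
        for line in f:
            m = pattern.match(line)
            if m:  # lines shorter than 15 characters do not match
                print(m.groups())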
Note that Python allows two-digit backreferences such as \20, so to have a backreference to group 2 followed by a literal 0 you need to use \g<2>0, which is unambiguous.
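To illustrate how Python treats these replacement strings, here is a small sketch using `re.sub`-style substitution. The 12-character sample string and the variable names are made up for the example; the real pattern would be the 40-group one from the question.

```python
import re

# Hypothetical sample: 12 single-character groups over a 12-character string.
pattern = re.compile(r'(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)')
text = 'abcdefghijkl'

# \10 is read as group 10 ("j"), not group 1 followed by "0".
group_ten = pattern.sub(r'\10', text)

# \g<1>0 is group 1 ("a") followed by a literal "0".
group_one_then_zero = pattern.sub(r'\g<1>0', text)

# \g<12> spells group 12 ("l") unambiguously.
group_twelve = pattern.sub(r'\g<12>', text)

print(group_ten, group_one_then_zero, group_twelve)
```

Using the `\g<N>` form everywhere avoids having to think about where a group number ends and literal digits begin.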
Edit: Most flavors of regex, and editors which include a regex engine, should follow the replace syntax as follows:
input:    abcdefghijklmnop
search:  (.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(?<name>.)(.)
note:     1  2  3  4  5  6  7  8  9 10 11     12    13
value:    a  b  c  d  e  f  g  h  i  j  k     l      m

replace:    result:
\11         a1        i.e. match 1, then the character "1"
${12}       l         most should support this
${name}     l         few support named references, but use them where you can
Named references are usually only available in specific flavors of regex libraries, so test your tool to know for sure.
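As one concrete case, here is a rough sketch of the same search in Python. It assumes Python's `re` module, which spells named groups `(?P<name>...)` rather than `(?<name>...)` and uses `\g<name>` in replacement strings instead of `${name}`. The square brackets in the replacement are only there to make the result easy to spot.

```python
import re

# Same 13-group search as above, with group 12 named "name" (Python syntax).
pattern = re.compile(r'(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(?P<name>.)(.)')
text = 'abcdefghijklmnop'

m = pattern.match(text)
by_number = m.group(12)      # group 12 by position
by_name = m.group('name')    # the same group by name

# In a replacement string, \g<name> refers to the named group.
replaced = pattern.sub(r'[\g<name>]', text)

print(by_number, by_name, replaced)
```

The match covers the first 13 characters, so the trailing `nop` is left untouched by the substitution.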