I apply regular expressions to an XML file to find and replace values. Normally this works. (I can hear the voices saying "use an XML parser", but for now I cannot.) But if there is a special character in a value, it ruins everything.
Suppose I have an XML file like the one below:
<fieldset>
<idle1>
<value>something\\n</value>
</idle1>
<idle2>
<value>blabla</value>
</idle2>
</fieldset>
If I try to replace the value in the "<idle2><value>" node, the value of the "<idle1><value>" node becomes "something\n". And when the result is written to the file, the XML becomes:
<fieldset>
<idle1>
<value>something
</value>
</idle1>
<idle2>
<value>blabla</value>
</idle2>
</fieldset>
In both the search and the replace I use "r" (raw) string literals, but that does not seem to help. I have worked around the problem: for every search and replace, I replace each "\n" with "\\n" and then write the result to the file. But that is not an efficient approach.
Is there something I am missing? I just want to write "\\n" to the file. Is that too much to ask?
Edit: here are my regexes.
For the search:
self.searchPattern=(<fieldset>)(.*?)(<idle2>)(.*?)(<value>)(.*?)(</value>)(.*?)(</idle2>)(.*?)(</fieldset>)
For the replacement:
self.replacePattern=`\g<1>\g<2>\g<3>\g<4><value>denemeasdasd\\\\n</value>\g<8>\g<9>\g<10>\g<11>`
This is the Python code for the search:
self.pattern = re.compile(r''''''+self.searchPattern+'''''', flags = re.S | re.U)
And this is the code for the replacement:
outtext = self.pattern.sub(r''''''+self.replacePattern+'''''',r''''''+self.match.group(0)+'''''')
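A minimal sketch of one possible cause, assuming that text containing "\\n" ends up being passed as the replacement argument of a later `sub` call (the code shown above does not confirm this). When the replacement is a string, `re.sub` processes backslash escapes in it, so each pass removes one level of escaping. The sample strings are made up for illustration.

```python
import re

# Text as it appears in the file: backslash, backslash, 'n'.
value = r'something\\n'

# As the subject of a substitution, the backslashes are left untouched.
as_subject = re.sub(r'blabla', 'new', value + ' blabla')

# As the replacement template, escapes are processed:
# the two backslashes collapse into one, leaving backslash + 'n'.
once = re.sub(r'X', value, 'X')

# Reusing that result as a template again turns backslash + 'n'
# into a real newline character.
twice = re.sub(r'X', once, 'X')

print(repr(as_subject))
print(repr(once))
print(repr(twice))
```

This would match the symptom described: "something\\n" first decays to "something\n" and then to a real line break.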
I don't understand your explanation.
Personally, I wrote this:
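import re

# Sample input, taken from the repr shown in the result below.
ch = ('<fieldset>\n <idle1>\n <value>something\\n</value>\n </idle1>\n'
      ' <idle2>\n <value>blabla</value>\n </idle2>\n</fieldset>')

# Group 1 captures everything up to and including <value> inside <idle2>;
# the lookahead requires </value> followed by the closing tag at the same indentation.
RE = ('(^([ \t]+)<(idle2)>(?:\n|\r\n?)[ \t]+<value>)'
      '(.*?)'
      '(?=</value>(?:\n|\r\n?)\\2</\\3>)')

print repr(ch),'\n'
print ch
print '\n-------------------------------------------------'
print repr(re.sub(RE,'\\1AAA',ch,flags = re.M)) , '\n'
print re.sub(RE,'\\1-----HHHHHHXXXXXXX-------',ch,flags = re.M)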
Result:
'<fieldset>\n <idle1>\n <value>something\\n</value>\n </idle1>\n <idle2>\n <value>blabla</value>\n </idle2>\n</fieldset>'
<fieldset>
<idle1>
<value>something\n</value>
</idle1>
<idle2>
<value>blabla</value>
</idle2>
</fieldset>
-------------------------------------------------
'<fieldset>\n <idle1>\n <value>something\\n</value>\n </idle1>\n <idle2>\n <value>AAA</value>\n </idle2>\n</fieldset>'
<fieldset>
<idle1>
<value>something\n</value>
</idle1>
<idle2>
<value>-----HHHHHHXXXXXXX-------</value>
</idle2>
</fieldset>
Is this what you want?
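If the new value itself has to contain backslashes (such as the "denemeasdasd\\n" from the question), one option is to pass a function to `re.sub` instead of a replacement string, since the return value of a function is inserted as-is. The following is only a sketch under that assumption; `new_value` and `put_value` are hypothetical names, and `ch` is the sample text from the result above.

```python
import re

# Same pattern as in the answer above.
RE = ('(^([ \t]+)<(idle2)>(?:\n|\r\n?)[ \t]+<value>)'
      '(.*?)'
      '(?=</value>(?:\n|\r\n?)\\2</\\3>)')

ch = ('<fieldset>\n <idle1>\n <value>something\\n</value>\n </idle1>\n'
      ' <idle2>\n <value>blabla</value>\n </idle2>\n</fieldset>')

# Hypothetical new value: ends with two literal backslashes and 'n'.
new_value = r'denemeasdasd\\n'

def put_value(match):
    # The string returned by a callable replacement is inserted verbatim,
    # so no backslash escapes are interpreted here.
    return match.group(1) + new_value

result = re.sub(RE, put_value, ch, flags=re.M)
print(result)
```

If a string template is still needed, doubling the backslashes first, for example with `new_value.replace('\\', '\\\\')`, should have the same effect.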