I have thousands of non well-formed XML files to patch up.
Many of them contain the following issue: <someTag attr='text [< 99]'/>
(note left angle in square brackets).
I would like to write a sed expression to replace all instances of [<
with [&lt;
for *.xml.
sed -n 19p myFile.xml
returns <someTag attr='text [<99]'/>
as expected.
echo '[<45' | sed -n '/\[</p'
returns [<45
as expected.
However, sed -n '/\[</p' myFile.xml
returns nothing so apparently I need a different syntax when using that expression against a file as opposed to echo. What syntax do I need to use?
Also, once I have this done, my plan is to do something like
sed -i -n 's/correct expression/\[&lt;/g/p' *.xml
to run it against all matches in all files and output the new version to help me debug. Does that seem reasonable?
BTW, sed seemed like the tool to use, but I'm perfectly fine using any other solution that runs on Linux.
Thanks!
However,
sed -n '/\[</p' myFile.xml
returns nothing so apparently I need a different syntax when using that expression against a file as opposed to echo.
Hm, works for me:
echo '[<45' > test.xml
sed -n '/\[</p' test.xml
returns [<45
.
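If the same expression matches via echo but not against the real file, one thing worth checking is whether the file contains bytes the terminal does not show. The sketch below is only an illustration under an assumption the post does not confirm: it builds a hypothetical file whose characters are separated by NUL bytes (as they would be in UTF-16 encoded text) and uses od -c to make them visible.

```shell
# Scratch directory so no real files are touched.
workdir=$(mktemp -d)

# Hypothetical sample: '[' and '<' are separated by NUL bytes,
# roughly what UTF-16 encoded text looks like to a byte-oriented tool.
printf '[\000<\00045\n' > "$workdir/odd.xml"

# The pattern needs '[' immediately followed by '<', so nothing matches.
matched=$(sed -n '/\[</p' "$workdir/odd.xml" | od -c)

# od -c shows every byte, including the otherwise invisible \0 entries.
dump=$(od -c "$workdir/odd.xml")
printf '%s\n' "$dump"
```

If the dump of the real line 19 (sed -n 19p myFile.xml | od -c) shows only the expected characters, the cause lies elsewhere.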
That said, if you want to replace, do something like
sed 's/\[</[\&lt;/g'
(the & is escaped because an unescaped & in the replacement stands for the whole match)
For example, to modify all xml files directly, do
sed -i 's/\[</[\&lt;/g' *.xml
(the -i switch is for directly modifying the files; otherwise, their contents will be sent to stdout)
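Since -i overwrites the files, it may be worth keeping backups on the first run. A minimal sketch, assuming the intended replacement is the entity &lt; and using a throwaway sample file (sample.xml is a made-up name); giving -i a suffix makes sed keep the original under that suffix:

```shell
# Scratch directory so no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' "<someTag attr='text [<99]'/>" > "$workdir/sample.xml"

# -i.bak edits in place and keeps the original as sample.xml.bak.
# \& is a literal ampersand; a bare & would insert the matched text.
sed -i.bak 's/\[</[\&lt;/g' "$workdir/sample.xml"

cat "$workdir/sample.xml"      # <someTag attr='text [&lt;99]'/>
cat "$workdir/sample.xml.bak"  # <someTag attr='text [<99]'/>
```

Once the results look right, the .bak files can be deleted.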
Does that seem reasonable?
Sure, that is what sed is for.
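One caveat about the debugging plan: the flags of s are written together (gp, not /g/p), and with GNU sed, combining -i with -n and p writes only the printed lines back to the file, so every line without a match would be lost. A safer preview, sketched here on a made-up sample file, is to drop -i and look at what would change:

```shell
# Scratch directory so no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' "<root>" "<someTag attr='text [<99]'/>" "</root>" > "$workdir/sample.xml"

# -n suppresses normal output; the p flag prints only lines where a
# substitution happened. Without -i the file itself is not modified.
preview=$(sed -n 's/\[</[\&lt;/gp' "$workdir/sample.xml")
printf '%s\n' "$preview"       # <someTag attr='text [&lt;99]'/>

# The file still has all three original lines.
lines=$(wc -l < "$workdir/sample.xml")
```

After checking the preview, the same expression can be run with -i and without -n and p.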