
Lua XML extract from pattern


An application is sending my script a Stream like this one:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <aRootChildNode>
    <anotherChildNode>
     <?xml version="1.0">
     <TheNodeImLookingFor>
       ... content ...
     </TheNodeImLookingFor>
    </anotherChildNode>
   </aRootChildNode>
</root>

I want to extract the TheNodeImLookingFor section. So far, I have:

data = string.match(Stream, "^.+\<TheNodeImLookingFor\>.+\<\/TheNodeImLookingFor\>.+$")

The pattern matches the Stream, but it returns the whole match instead of extracting the node and its content.


Solution

  • In general, it's not a good idea to use pattern matching (either Lua patterns or regex) to extract XML. Use an XML parser.

    For this problem, you don't need to escape / or < (and even if you did, Lua patterns use % rather than \ to escape magic characters). Put parentheses around the part you want to capture to get the node and its content:

    data = string.match(Stream, "^.+(<TheNodeImLookingFor>.+</TheNodeImLookingFor>).+$")
    

    Or to get only the content:

    data = string.match(Stream, "^.+<TheNodeImLookingFor>(.+)</TheNodeImLookingFor>.+$")
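The patterns above use the greedy .+, so they need at least one character before and after the node, and with several TheNodeImLookingFor nodes in the Stream they would capture from the last opening tag onwards. A minimal sketch of a variant that drops the anchors and uses the non-greedy .- to stop at the first closing tag is shown below. The helper name extract_node and the sample content are assumptions made for illustration, and this still does not replace a real XML parser (it ignores attributes, nesting of the same tag, and CDATA).

```lua
-- Sample input modelled on the Stream in the question (the content is made up).
local Stream = [[
<?xml version="1.0" encoding="UTF-8"?>
<root>
   <aRootChildNode>
    <anotherChildNode>
     <?xml version="1.0">
     <TheNodeImLookingFor>
       some content
     </TheNodeImLookingFor>
    </anotherChildNode>
   </aRootChildNode>
</root>
]]

-- Hypothetical helper: returns the whole node and its inner content,
-- or nil if the tag is not found.
local function extract_node(xml, tag)
  -- Escape punctuation in the tag name with %, the Lua pattern escape.
  local safe = tag:gsub("%p", "%%%0")
  -- Outer capture: the node including its tags.
  -- Inner capture: only the content; .- is the shortest possible match.
  local pattern = "(<" .. safe .. ">(.-)</" .. safe .. ">)"
  return xml:match(pattern)
end

local node, content = extract_node(Stream, "TheNodeImLookingFor")
print(node)
print(content)
```

Because string.match returns one value per capture, the same call yields both the full node and its content, and it returns nil when the tag is absent.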