xml, parsing, lua, lua-patterns

Using Lua's match and gmatch pattern-matching functions to parse an XML file, but not getting the expected result


I want to parse and print specific data from an XML file using Lua.

Here is a snippet from my XML code:

<Service>
<NewInstance ref="E961625723F5FDC8BD550077282E074C">
<Std>DiscoveredElement</Std>
<Key>E961625723F5FDC8BD550077282E074C</Key>
<Attributes>
<Attribute name="TARGET_TYPE" value="j2ee_application" />
<Attribute name="AppType" value="ear" />
<Attribute name="TARGET_GUID" value="E961625723F5FDC8BD550077282E074C" />
<Attribute name="TARGET_NAME"
value="/Farm_b2b4_sys20_b2b4_domain/b2b4_domain/WLS_B2B4a/worklistapp" />
</Attributes>
</NewInstance>
<NewInstance ref="FD8A116D5C8DD2332B024BCBD6A81BD8">
<Std>DiscoveredElement</Std>
<Key>FD8A116D5C8DD2332B024BCBD6A81BD8</Key>
<Attributes>
<Attribute name="TARGET_TYPE" value="composite" />
<Attribute name="SERVICE_TYPE" value="" />
<Attribute name="TARGET_NAME" value="LAB-DB-B-AIX-Grp" />
<Attribute name="TARGET_GUID" value="FD8A116D5C8DD2332B024BCBD6A81BD8" />
</Attributes>
</NewInstance>
</Service>

From this XML file, I want to display the values of TARGET_TYPE and TARGET_NAME for every NewInstance element in the file.

I tried it in two ways. Lua code 1:

local file = io.open("sample.xml", "rb")   -- Open file for reading (binary data)
for instance in file:read("*a"):gmatch("<NewInstance ref=\"(.-)\">") do  -- Read whole file content and iterate through attribute matches

TARGET_TYPE = instance:gmatch('TARGET_TYPE.-value=\"(.-)\"')
TARGET_NAME = instance:gmatch('TARGET_NAME.-value=\"(.-)\"')
print("New Instance :", instance)
print("Target Type : ",TARGET_TYPE)
print("Target Name : ",TARGET_NAME)
end
file:close()

The output I get for this is:

New Instance :  E961625723F5FDC8BD550077282E074C
Target Type :   function: 0050E9C0
Target Name :   function: 0050EA10
New Instance :  FD8A116D5C8DD2332B024BCBD6A81BD8
Target Type :   function: 0050EA60
Target Name :   function: 0050EAB0
Exit code: 0

It seems to pick up random values for the target type and target name.

Lua code 2:

local file = io.open("sample.xml", "rb")   -- Open file   for reading (binary data)
for instance in file:read("*a"):gmatch("<NewInstance ref=\"(.-)\">") do
TARGET_TYPE = instance:match('TARGET_TYPE.-value="(.-)"')
TARGET_NAME = instance:match('TARGET_NAME.-value="(.-)"')
print("New Instance :", instance)
print("Target Type : ",TARGET_TYPE)
print("Target Name : ",TARGET_NAME)
end
file:close()

This gives the output:

lua -e "io.stdout:setvbuf 'no'" "prac.lua" 
New Instance :  E961625723F5FDC8BD550077282E074C
Target Type :   nil
Target Name :   nil
New Instance :  FD8A116D5C8DD2332B024BCBD6A81BD8
Target Type :   nil
Target Name :   nil
Exit code: 0

Please suggest a way to retrieve the correct attribute values.


Solution

  • In general, it's not a great idea to use Lua patterns (or regex) to parse XML. Use an XML parser instead.

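    For this particular problem, the first snippet fails because gmatch returns an iterator function rather than the matched text. Printing TARGET_TYPE and TARGET_NAME therefore prints the iterator itself (for example, function: 0050E9C0), not a value from the file. The iterator has to be called, usually in a for loop; to get a single capture, use match instead.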
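
To make the difference between gmatch and match concrete, here is a small sketch. The string s and the variable names are made up for illustration; only string.gmatch and string.match come from Lua itself.

```lua
-- Hypothetical one-line input, standing in for part of the XML.
local s = 'name="TARGET_TYPE" value="composite"'

-- gmatch returns an iterator function, not the matched text.
local it = s:gmatch('value="(.-)"')
print(type(it))                  -- function

-- Calling the iterator (or driving it with a for loop) yields the capture.
local first = it()
print(first)                     -- composite

-- match returns the first capture directly.
local direct = s:match('value="(.-)"')
print(direct)                    -- composite
```

Printing the iterator without calling it is what produces lines such as "function: 0050E9C0" in the question's first output.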

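    The second snippet uses match correctly, but the pattern <NewInstance ref=\"(.-)\"> captures only the value of the ref attribute. As a result, instance holds just the ID string, so the TARGET_TYPE and TARGET_NAME lookups find nothing and return nil. Capture the text between the opening <NewInstance ref=...> tag and </NewInstance> instead (here xml is a string holding the file contents, for example the result of file:read("*a")):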

    for instance in xml:gmatch("<NewInstance ref=\".-\">(.-)</NewInstance>") do
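
A complete loop built on that pattern could look like the sketch below. It is only an illustration under a few assumptions: an abbreviated copy of the sample data is embedded in the string xml so the snippet runs without sample.xml, the pattern also captures ref so the ID can still be printed, and the results table and the stricter name="..."%s+value="..." patterns are choices made here, not part of the original answer.

```lua
-- Abbreviated sample data; in practice xml would come from file:read("*a").
local xml = [[
<Service>
<NewInstance ref="E961625723F5FDC8BD550077282E074C">
<Attributes>
<Attribute name="TARGET_TYPE" value="j2ee_application" />
<Attribute name="TARGET_NAME"
value="/Farm_b2b4_sys20_b2b4_domain/b2b4_domain/WLS_B2B4a/worklistapp" />
</Attributes>
</NewInstance>
<NewInstance ref="FD8A116D5C8DD2332B024BCBD6A81BD8">
<Attributes>
<Attribute name="TARGET_TYPE" value="composite" />
<Attribute name="SERVICE_TYPE" value="" />
<Attribute name="TARGET_NAME" value="LAB-DB-B-AIX-Grp" />
</Attributes>
</NewInstance>
</Service>
]]

local results = {}

-- First capture: the ref value. Second capture: everything inside the element.
for ref, body in xml:gmatch('<NewInstance ref="(.-)">(.-)</NewInstance>') do
  -- %s+ also matches a newline, so an attribute split across two lines still matches.
  local target_type = body:match('name="TARGET_TYPE"%s+value="(.-)"')
  local target_name = body:match('name="TARGET_NAME"%s+value="(.-)"')

  results[#results + 1] = { ref = ref, type = target_type, name = target_name }

  print("New Instance :", ref)
  print("Target Type : ", target_type)
  print("Target Name : ", target_name)
end
```

This relies on the attributes always being written as name followed by value, and on NewInstance elements not being nested. Both hold for the sample, but they are exactly the kind of assumption an XML parser would not need.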