Parsing XML data with multiple children in TCL tDOM


My XML File:

<?xml version="1.0"?>
<root>
<msg>
    <MessageError>
        <BookingID>123</BookingID>
        <Error>Invalid patient name</Error>
        <Error>PATIENT NOT FOUND</Error>
        <Message>Incoming MESSAGE DATA 1</Message>
    </MessageError>
    <MessageError>
        <BookingID>456</BookingID>
        <Error>Undefined patient account number.</Error>
        <Error>Undefined Account Number</Error>
        <Message>Incoming MESSAGE DATA 2</Message>
    </MessageError>
    <MessageError>
        <BookingID>789</BookingID>
        <Error>DOB invalid</Error>
        <Message>Incoming MESSAGE DATA 3</Message>
    </MessageError>
</msg>
</root>

My Tcl script:

        set dom [dom parse $msg]
        set root [$dom documentElement]         

        set MessageError [$root selectNodes "/root/msg/MessageError" ]
        foreach node $MessageError {
            set Error [$root selectNodes {/root/msg/MessageError/Error} ]
            #set bookingid [$MessageError text]
            #echo "BookingIDXML - $bookingid"
            #set message [$MessageError text]
            #echo "MessageXML - $message"

            foreach errornode $Error {
                set error [$errornode text]
                echo "ErrorXML - $error"
            }
        }

My output:

ErrorXML - Invalid patient name
ErrorXML - PATIENT NOT FOUND
ErrorXML - Undefined patient account number.
ErrorXML - Undefined Account Number
ErrorXML - DOB invalid
ErrorXML - Invalid patient name
ErrorXML - PATIENT NOT FOUND
ErrorXML - Undefined patient account number.
ErrorXML - Undefined Account Number
ErrorXML - DOB invalid
ErrorXML - Invalid patient name
ErrorXML - PATIENT NOT FOUND
ErrorXML - Undefined patient account number.
ErrorXML - Undefined Account Number
ErrorXML - DOB invalid

There is little documentation online for this powerful tool. How do I get the output below? The commented-out ('#') sections of my code don't work.

BookingIDXML - 123
ErrorXML - Invalid patient name
MessageXML - Incoming MESSAGE DATA 1

BookingIDXML - 123
ErrorXML - PATIENT NOT FOUND
MessageXML - Incoming MESSAGE DATA 1

BookingIDXML - 456
ErrorXML - Undefined patient account number.
MessageXML - Incoming MESSAGE DATA 2

BookingIDXML - 456
ErrorXML - Undefined Account Number
MessageXML - Incoming MESSAGE DATA 2

BookingIDXML - 789
ErrorXML - DOB invalid
MessageXML - Incoming MESSAGE DATA 3

Thanks in Advance.


Solution

  • The selectNodes method uses XPath (which is very well documented) to find the results to return, and the context node is the object on which you invoke the method. Thus, to find the Error nodes for a particular MessageError, you have to invoke selectNodes on that MessageError node and use a path relative to it, rather than searching again from the document root. In particular, you probably want the code to do something like this:

    foreach messageError [$root selectNodes "/root/msg/MessageError"] {
        # Print some general info (to separate error groups)
        set bookingID [lindex [$messageError selectNodes "BookingID"] 0]
        puts "ID: [$bookingID text]"
        set message [lindex [$messageError selectNodes "Message"] 0]
        puts "Message: [$message text]"
        # Print the errors
        foreach error [$messageError selectNodes "Error"] {
            puts "Error: [$error text]"
        }
    }
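
To get exactly the layout asked for in the question (one block per Error, with the BookingID and Message repeated each time), the same idea can be sketched as below. This is only a sketch under a few assumptions: the tdom package is installed, `formatErrors` is a hypothetical helper name, `puts` stands in for the `echo` used in the question, and the XPath `string()` function is used so that selectNodes returns the text of the first matching child directly instead of a node list.

```tcl
package require tdom

# Sample document from the question, inlined so the sketch is self-contained.
set xml {<?xml version="1.0"?>
<root>
<msg>
    <MessageError>
        <BookingID>123</BookingID>
        <Error>Invalid patient name</Error>
        <Error>PATIENT NOT FOUND</Error>
        <Message>Incoming MESSAGE DATA 1</Message>
    </MessageError>
    <MessageError>
        <BookingID>456</BookingID>
        <Error>Undefined patient account number.</Error>
        <Error>Undefined Account Number</Error>
        <Message>Incoming MESSAGE DATA 2</Message>
    </MessageError>
    <MessageError>
        <BookingID>789</BookingID>
        <Error>DOB invalid</Error>
        <Message>Incoming MESSAGE DATA 3</Message>
    </MessageError>
</msg>
</root>}

# Hypothetical helper: returns one three-line text block per Error element.
proc formatErrors {root} {
    set blocks {}
    foreach messageError [$root selectNodes /root/msg/MessageError] {
        # Relative paths: evaluated against the current MessageError only.
        set bookingID [$messageError selectNodes {string(BookingID)}]
        set message   [$messageError selectNodes {string(Message)}]
        foreach errorNode [$messageError selectNodes Error] {
            lappend blocks [join [list \
                "BookingIDXML - $bookingID" \
                "ErrorXML - [$errorNode text]" \
                "MessageXML - $message"] "\n"]
        }
    }
    return $blocks
}

set doc [dom parse $xml]
puts [join [formatErrors [$doc documentElement]] "\n\n"]
$doc delete   ;# free the DOM tree once it is no longer needed
```

The BookingID and Message are looked up once per MessageError and reused for each of its Error children, so every printed block carries the values of its own parent.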
    

    If you prefer, you could use ./Error instead of Error as the XPath search term; the effect is the same, but it looks a bit more like a path. What you shouldn't do is restart the search from the root of the document (as /root/msg/MessageError/Error does), because then you find every node that matches that path, not just those within the current context node. That is why the script in the question prints all five errors once per MessageError. (Think of the context node as a bit like the current directory in a filesystem, and the elements as a bit like directories. The analogy is only partial, since DOM trees aren't directories, but it is still useful.)
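
The difference between the absolute and the relative paths can be checked by counting matches from each MessageError node. The following is a minimal sketch that assumes the tdom package is available and uses a shortened copy of the question's data (two MessageError elements instead of three); `countMatches` is a hypothetical helper name.

```tcl
package require tdom

# Shortened sample: the first MessageError has two Error children, the second has one.
set doc [dom parse {<root><msg>
    <MessageError>
        <BookingID>123</BookingID>
        <Error>Invalid patient name</Error>
        <Error>PATIENT NOT FOUND</Error>
    </MessageError>
    <MessageError>
        <BookingID>789</BookingID>
        <Error>DOB invalid</Error>
    </MessageError>
</msg></root>}]
set root [$doc documentElement]

# Hypothetical helper: number of nodes an XPath expression selects from a context node.
proc countMatches {node xpath} {
    return [llength [$node selectNodes $xpath]]
}

foreach messageError [$root selectNodes /root/msg/MessageError] {
    # A path starting with "/" ignores the context node and searches the whole document.
    set absolute [countMatches $messageError /root/msg/MessageError/Error]
    # "Error" and "./Error" both select only children of the context node.
    set relative [countMatches $messageError Error]
    set dotted   [countMatches $messageError ./Error]
    puts "absolute=$absolute relative=$relative dotted=$dotted"
}
# prints:
#   absolute=3 relative=2 dotted=2
#   absolute=3 relative=1 dotted=1
```

The absolute path yields the same total for every context node, whereas the two relative forms agree with each other and report only the children of the node they are invoked on.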