c# xmlreader xmlnode

Getting XPath for node with XmlReader


How can I get the XPath of the current node while reading with XmlReader?

E.g.:

<Employee>
    <Entity>
        <Id>1</Id>
    </Entity>
</Employee>

Here I need the XPath for the text node 1, which is Employee/Entity/Id. Any ideas?

using (var reader = XmlReader.Create(basePath, settings))
{
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Text)
        {
            // need to get xpath of the text node
        }
        else if (reader.NodeType == XmlNodeType.Element)
        {
            // need to get xpath of the current node
        }
    }
}

Solution

  • My first suggestion would be to use a higher-level API such as LINQ to XML. The main reason to reach for a low-level API like XmlReader is to stream extremely large files without loading the whole document into memory. With LINQ to XML, a naive implementation is fairly trivial:

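    // xml is a string holding the document text
    var doc = XDocument.Parse(xml);

    foreach (var element in doc.Descendants())
    {
        // AncestorsAndSelf() yields the element first, then its ancestors up to the root,
        // so reverse the names to get root-to-leaf order
        var path = element.AncestorsAndSelf().Select(e => e.Name.LocalName).Reverse();
        var xPath = string.Join("/", path);
    }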

    Using XmlReader is a bit more involved as you have to track the element path as you go:

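    // Requires: using System.Collections.Generic; using System.Linq; using System.Xml;
    using (var reader = XmlReader.Create(basePath, settings))
    {
        // Names of the currently open elements, innermost on top
        var elements = new Stack<string>();

        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    // An empty element (<a/>) produces no EndElement node, so it is never pushed
                    if (!reader.IsEmptyElement)
                        elements.Push(reader.LocalName);
                    break;
                case XmlNodeType.EndElement:
                    elements.Pop();
                    break;
                case XmlNodeType.Text:
                    // A Stack enumerates top-first, so reverse it to get root-to-leaf order
                    var path = string.Join("/", elements.Reverse());
                    break;
            }
        }
    }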

    Here's a working demo: https://dotnetfiddle.net/dpOzuL

    Note that while this works for your simple example, it is a very naive way to build an XPath expression and won't work in all cases (for example, when an element has multiple siblings with the same name, or when namespaces are involved).
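
One way to address the same-name sibling case is to add a 1-based positional predicate to every step. The following is only a sketch under a few assumptions: the class name IndexedXPath and the method TextNodePaths are hypothetical names made up for this illustration, the XML is passed in as a string rather than read from basePath, and elements are still matched by local name, so namespaces remain unhandled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;

public static class IndexedXPath
{
    // Hypothetical helper (not part of the original answer): returns one
    // (path, value) pair per text node, e.g. "Employee[1]/Entity[1]/Id[1]".
    public static List<KeyValuePair<string, string>> TextNodePaths(string xml)
    {
        var results = new List<KeyValuePair<string, string>>();

        // Steps of the current path, innermost element on top
        var steps = new Stack<string>();

        // One table per open element: child name -> how many were seen so far
        var counters = new Stack<Dictionary<string, int>>();
        counters.Push(new Dictionary<string, int>()); // document level

        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                    {
                        // Count this element among its same-name siblings
                        var siblings = counters.Peek();
                        siblings.TryGetValue(reader.LocalName, out int seen);
                        siblings[reader.LocalName] = seen + 1;

                        // Empty elements are counted but never pushed,
                        // because no EndElement node follows them
                        if (!reader.IsEmptyElement)
                        {
                            steps.Push(reader.LocalName + "[" + (seen + 1) + "]");
                            counters.Push(new Dictionary<string, int>());
                        }
                        break;
                    }
                    case XmlNodeType.EndElement:
                        steps.Pop();
                        counters.Pop();
                        break;
                    case XmlNodeType.Text:
                        results.Add(new KeyValuePair<string, string>(
                            string.Join("/", steps.Reverse()), reader.Value));
                        break;
                }
            }
        }

        return results;
    }
}
```

Keeping a separate counter table per open element means the index restarts under each parent, which matches how an XPath predicate like Id[2] counts positions among the children of a single parent. For very wide documents the tables grow with the number of distinct child names per open element, not with the document size, so the approach stays compatible with streaming.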