c# xml linq-to-xml

XDocument.Descendants cannot distinguish between Parent / Child Elements


I have an XML document like this:

<?xml version="1.0" encoding="utf-8"?>
<Document>
    <Interface>
        <Sections xmlns="http://www.siemens.com/automation/Openness/SW/Interface/v4">
            <Section Name="Static">
                <Member Name="3bool1" Datatype="&quot;3bool&quot;" Remanence="NonRetain" Accessibility="Public">
                    <AttributeList>
                        <BooleanAttribute Name="ExternalAccessible" SystemDefined="true">true</BooleanAttribute>
                        <BooleanAttribute Name="ExternalVisible" SystemDefined="true">true</BooleanAttribute>
                        <BooleanAttribute Name="ExternalWritable" SystemDefined="true">true</BooleanAttribute>
                        <BooleanAttribute Name="SetPoint" SystemDefined="true">false</BooleanAttribute>
                    </AttributeList>
                    <Sections>
                        <Section Name="None">
                            <Member Name="bool1" Datatype="Bool" />
                            <Member Name="bool2" Datatype="Bool" />
                            <Member Name="bool3" Datatype="Bool" />
                        </Section>
                    </Sections>
                </Member>
                <Member Name="int7" Datatype="Int" Remanence="NonRetain" Accessibility="Public">
                    <AttributeList>
                        <BooleanAttribute Name="ExternalAccessible" SystemDefined="true">true</BooleanAttribute>
                        <BooleanAttribute Name="ExternalVisible" SystemDefined="true">true</BooleanAttribute>
                        <BooleanAttribute Name="ExternalWritable" SystemDefined="true">true</BooleanAttribute>
                        <BooleanAttribute Name="SetPoint" SystemDefined="true">true</BooleanAttribute>
                    </AttributeList>
                </Member>
            </Section>
        </Sections>
    </Interface>
 </Document>

With the following code I can select the "Datatype" attribute of every descendant "Member" element whose data type contains "Bool":

XNamespace ns = "http://www.siemens.com/automation/Openness/SW/Interface/v4";

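var memb = doc.Descendants(ns + "Member")
     .Select(f => f.Attribute("Datatype"))
     // Skip <Member> elements that have no Datatype attribute to avoid a NullReferenceException.
     .Where(n => n != null && n.Value.Contains("Bool"))
     .ToList();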

memb.ForEach(i => Console.WriteLine("{0}\t", i));

What I want to do is:

  1. Determine whether a "Member" element has nested "Member" child elements (composite data types);
  2. Extract all child "Member" elements of that specific parent element (in this case bool1, bool2, bool3).

Solution

  • You have an XDocument with a recursive schema in which <Member> elements can contain nested <Member> descendants. You would like to iterate through the document, returning the top-level <Member> elements grouped with their topmost <Member> descendants, and then extract some data from those descendants. Currently you are using XDocument.Descendants(), which returns every matching element in a single flat sequence and so loses the grouping of parents and children.
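
    To make the flattening concrete, here is a minimal sketch that runs Descendants() over a trimmed copy of the XML from the question (the attribute lists are omitted for brevity; the variable names are only illustrative):

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    // Trimmed copy of the question's XML: one composite member and one simple member.
    var doc = XDocument.Parse(@"<Document>
      <Interface>
        <Sections xmlns=""http://www.siemens.com/automation/Openness/SW/Interface/v4"">
          <Section Name=""Static"">
            <Member Name=""3bool1"" Datatype=""&quot;3bool&quot;"">
              <Sections>
                <Section Name=""None"">
                  <Member Name=""bool1"" Datatype=""Bool"" />
                  <Member Name=""bool2"" Datatype=""Bool"" />
                  <Member Name=""bool3"" Datatype=""Bool"" />
                </Section>
              </Sections>
            </Member>
            <Member Name=""int7"" Datatype=""Int"" />
          </Section>
        </Sections>
      </Interface>
    </Document>");

    XNamespace ns = "http://www.siemens.com/automation/Openness/SW/Interface/v4";

    // Descendants() walks the whole subtree in document order, so the parent
    // member and its nested members come back in one flat sequence.
    var flatNames = doc.Descendants(ns + "Member")
        .Select(m => m.Attribute("Name")?.Value)
        .ToList();

    Console.WriteLine(string.Join(", ", flatNames));
    // prints: 3bool1, bool1, bool2, bool3, int7
    ```

    Nothing in that sequence says that bool1, bool2 and bool3 belong to 3bool1, which is the information the question needs.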

    One way to do this would be to use the method XElementExtensions.DescendantsUntil(this XElement? root, Func<XElement, bool> predicate) from this answer to "How to find highest level descendants with a given name" to find the topmost <Member> elements, then, for each of those, use DescendantsUntil() again to find their topmost <Member> descendants.

    First define DescendantsUntil() as follows:

    public static partial class XElementExtensions
    {
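        /// <summary>
        /// Enumerates through all descendants of the given element, returning the topmost elements that match the given predicate.
        /// The descendants of a matching element are not searched.
        /// </summary>
        /// <param name="root">The element whose descendants are searched. The element itself is never returned.</param>
        /// <param name="predicate">Returns true for the elements that should be returned.</param>
        /// <returns>The topmost matching descendants, in document order.</returns>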
        public static IEnumerable<XElement> DescendantsUntil(this XElement? root, Func<XElement, bool> predicate)
        {
            if (predicate == null)
                throw new ArgumentNullException(nameof(predicate));
            return GetDescendantsUntil(root, predicate, false);
        }
    
        static IEnumerable<XElement> GetDescendantsUntil(XElement? root, Func<XElement, bool> predicate, bool includeSelf)
        {
            if (root == null)
                yield break;
            if (includeSelf && predicate(root))
            {
                yield return root;
                yield break;
            }
            var current = root.FirstChild<XElement>();
            while (current != null)
            {
                var isMatch = predicate(current);
                if (isMatch)
                    yield return current;
    
                // If not a match, get the first child of the current element.
                var next = (isMatch ? null : current.FirstChild<XElement>());
    
                if (next == null)
                    // If no first child, get the next sibling of the current element.
                    next = current.NextSibling<XElement>();
    
                // If no more siblings, crawl up the list of parents until hitting the root, getting the next sibling of the lowest parent that has more siblings.
                if (next == null)
                {
                    for (var parent = current.Parent as XElement; parent != null && parent != root && next == null; parent = parent.Parent as XElement)
                    {
                        next = parent.NextSibling<XElement>();
                    }
                }
    
                current = next;
            }
        }
    
        public static TNode? FirstChild<TNode>(this XNode node) where TNode : XNode => node switch
            {
                XContainer container => container.FirstNode?.NextSibling<TNode>(true),
                _ => null,
            };
    
        public static TNode? NextSibling<TNode>(this XNode node) where TNode : XNode =>
            node.NextSibling<TNode>(false);
    
        public static TNode? NextSibling<TNode>(this XNode node, bool includeSelf) where TNode : XNode
        {
            for (var current = (includeSelf ? node : node.NextNode); current != null; current = current.NextNode)
            {
                var nextTNode = current as TNode;
                if (nextTNode != null)
                    return nextTNode;
            }
            return null;
        }
        //Taken from this answer https://stackoverflow.com/a/46016931/3744182
        //To https://stackoverflow.com/questions/46007738/how-to-find-first-descendant-level
        //With added nullability annotations and updated syntax
    }
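
    As an aside, the meaning of "topmost" may be easier to see in a shorter recursive sketch. The TopmostMatches() function and the sample XML below are hypothetical and not part of the linked answer; the sketch is intended to return the same elements as DescendantsUntil(), but it recurses once per nesting level, whereas the iterative version above avoids that:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    // Hypothetical helper: returns matching descendants but does not look
    // inside an element once it has matched.
    static IEnumerable<XElement> TopmostMatches(XElement root, Func<XElement, bool> predicate)
    {
        foreach (var child in root.Elements())
        {
            if (predicate(child))
            {
                // A match: return it and skip everything nested inside it.
                yield return child;
            }
            else
            {
                // Not a match: keep searching one level down.
                foreach (var match in TopmostMatches(child, predicate))
                    yield return match;
            }
        }
    }

    // Illustrative input: <m id='2'> is nested inside <m id='1'>.
    var sample = XElement.Parse(
        "<r><a><m id='1'><m id='2'/></m></a><m id='3'/><b><c><m id='4'/></c></b></r>");

    var topmostIds = TopmostMatches(sample, e => e.Name == "m")
        .Select(e => e.Attribute("id")?.Value)
        .ToList();

    Console.WriteLine(string.Join(",", topmostIds));
    // prints: 1,3,4
    ```

    The element with id 2 is skipped because its ancestor with id 1 already matched, while id 4 is still found even though it sits two levels below a non-matching element.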
    

    And now you can do:

    XNamespace ns = "http://www.siemens.com/automation/Openness/SW/Interface/v4";
    XName name = ns + "Member";
    
    var members = doc
        .Root.DescendantsUntil(e => e.Name == name)
        .Select(e => (Parent: e, Children : e.DescendantsUntil(c => c.Name == name).ToList()))
        //.Where(i => i.Children.Count > 0) // Uncomment to filter out <Member> elements with no child members.
        .ToList();
    
    members.ForEach(i => Console.WriteLine("Parent: \"{0}\", Children: {1}", 
                                           i.Parent.Attribute("Name")?.Value, 
                                           i.Children.Count == 0 
                                           ? "None" :
                                           string.Join(",", 
                                                       i.Children.Select(c => $"\"{c.Attribute("Name")?.Value}\""))));
    

    Which prints

    Parent: "3bool1", Children: "bool1","bool2","bool3"
    Parent: "int7", Children: None
    

    If you are not interested in the <Member Name="int7"> element, which has no <Member> children, you can filter such elements out by uncommenting the Where() clause above:

        .Where(i => i.Children.Count > 0) // Uncomment to filter out <Member> elements with no child members.
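
    For comparison, a similar grouping can be sketched with only the built-in LINQ to XML methods, by checking each <Member> element's <Member> ancestors. This is a different technique from DescendantsUntil(), not a replacement for it: it examines the ancestors of every candidate, so it will likely do more work on large or deeply nested documents. The XML is again a trimmed copy of the question's sample:

    ```csharp
    using System;
    using System.Linq;
    using System.Xml.Linq;

    XNamespace ns = "http://www.siemens.com/automation/Openness/SW/Interface/v4";
    XName name = ns + "Member";

    var doc = XDocument.Parse(@"<Document>
      <Interface>
        <Sections xmlns=""http://www.siemens.com/automation/Openness/SW/Interface/v4"">
          <Section Name=""Static"">
            <Member Name=""3bool1"" Datatype=""&quot;3bool&quot;"">
              <Sections>
                <Section Name=""None"">
                  <Member Name=""bool1"" Datatype=""Bool"" />
                  <Member Name=""bool2"" Datatype=""Bool"" />
                  <Member Name=""bool3"" Datatype=""Bool"" />
                </Section>
              </Sections>
            </Member>
            <Member Name=""int7"" Datatype=""Int"" />
          </Section>
        </Sections>
      </Interface>
    </Document>");

    var groups = doc.Descendants(name)
        // Top-level members: no <Member> element anywhere above them.
        .Where(m => !m.Ancestors(name).Any())
        .Select(top => (
            Parent: top,
            // Ancestors() starts at the nearest ancestor, so First() is the
            // closest enclosing <Member>; keep only members owned directly by 'top'.
            Children: top.Descendants(name)
                .Where(c => c.Ancestors(name).First() == top)
                .ToList()))
        .ToList();

    foreach (var (parent, children) in groups)
    {
        var parentName = parent.Attribute("Name")?.Value;
        var childNames = children.Count == 0
            ? "None"
            : string.Join(",", children.Select(c => c.Attribute("Name")?.Value));
        Console.WriteLine($"Parent: {parentName}, Children: {childNames}");
    }
    ```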
    
