Search code examples
asp.net-corelinq-to-xmlxmldocument

group XDocument by multiple nodes (dynamic)


I have below data coming in as DataTable. If I directly convert it to XML I get nice xdocument object.

enter image description here

However problem is I need to group it by first 4 columns so that XML is shown as below. I know of only three nodes 'Segment', 'Price' and 'Qty' rest columns in datatable could be dynamic and cannot use hardcoded names (except for above 3)

<ROOT>
<ROW>
    <Col1>CESLP</Col1>
    <Col2>MRP</Col2>
    <Col3>372</Col3>
    <Date>20040101</Date>
    <BID_INTERVALS>
        <SEGMENT>1</SEGMENT>
        <Price>10</Price>
        <QTY>5</QTY>
    </BID_INTERVALS>
    <BID_INTERVALS>
        <SEGMENT>2</SEGMENT>
        <Price>15</Price>
        <QTY>6</QTY>
    </BID_INTERVALS>
</ROW>
<ROW>
    <Col1>CESLP</Col1>
    <Col2>MRP</Col2>
    <Col3>372</Col3>
    <Date>20040102</Date>
    <BID_INTERVALS>
        <SEGMENT>1</SEGMENT>
        <Price>11</Price>
        <QTY>5</QTY>
    </BID_INTERVALS>
    <BID_INTERVALS>
        <SEGMENT>2</SEGMENT>
        <Price>14.5</Price>
        <QTY>6</QTY>
    </BID_INTERVALS>
</ROW>

Any solution? I'm stuck for quite some time, tried xdocument group by 'except' but didn't worked for me.

Edit1:

I'm using below code to group the records (using solution from here

dataTable.AsEnumerable()
                .GroupBy(r => new NTuple<object>(from column in colNames select r[column]))
                .Select(g => g.CopyToDataTable()).ToList();

Solution

  • It's not entirely clear from the bitmap in your question whether you intially have a DataTable or an XDocument. So let's assume you have an XDocument, and you would like to group the child rows of the root element by the values of their first four columns, with the remaining values collected under a sequence of elements named <BID_INTERVALS>.

    This can be accomplished using the following extension method:

    public static partial class XNodeExtensions
    {
        public static XElement CopyAndGroupChildrenByColumns(this XElement root, Func<XName, int, bool> columnFilter, XName groupName) =>
            new XElement(root.Name, 
                         root.Attributes(),
                         root.Elements()
                         .Select((row) => (row, key : row.Elements().Where((e, i) => columnFilter(e.Name, i)).Select(e => (e.Name, e.Value)).ToHashSet()))
                         .GroupByKeyAndSet(pair => pair.row.Name, pair => pair.key)
                         .Select(g => new XElement(g.Key.Key, 
                                                   g.Key.Set.Select(p => new XElement(p.Name, p.Value)).Concat(g.Select(i => new XElement(groupName, i.row.Elements().Where((e, i) => !columnFilter(e.Name, i))))))));
    
        public static IEnumerable<IGrouping<(TKey Key, HashSet<TItem> Set), TSource>> GroupByKeyAndSet<TSource, TKey, TItem>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, Func<TSource, HashSet<TItem>> setSelector) =>
            Enumerable.GroupBy(source, (i) => (keySelector(i), setSelector(i)), new CombinedComparer<TKey, HashSet<TItem>>(null, HashSet<TItem>.CreateSetComparer()));
    }
    
    public class CombinedComparer<T1, T2> : IEqualityComparer<ValueTuple<T1, T2>>
    {
        readonly IEqualityComparer<T1> comparer1;
        readonly IEqualityComparer<T2> comparer2;
        public CombinedComparer(IEqualityComparer<T1> comparer1, IEqualityComparer<T2> comparer2) => (this.comparer1, this.comparer2) = (comparer1 ?? EqualityComparer<T1>.Default, comparer2 ?? EqualityComparer<T2>.Default);
        public bool Equals(ValueTuple<T1, T2> x, ValueTuple<T1, T2> y) => comparer1.Equals(x.Item1, y.Item1) && comparer2.Equals(x.Item2, y.Item2);
        public int GetHashCode(ValueTuple<T1, T2> obj) => HashCode.Combine(comparer1.GetHashCode(obj.Item1), comparer2.GetHashCode(obj.Item2));
    }
    

    Then, given some XDocument doc, you can do:

    // Group by the first four columns with all remaining elements collected under a <BID_INTERVALS> sequence of elements:
    XName groupName = doc.Root.Name.Namespace + "BID_INTERVALS";
    var grouped = doc.Root.CopyAndGroupChildrenByColumns((n, i) => (i < 4), groupName);
    
    var newDoc = new XDocument(grouped);
    

    If, on the other hand, you have a DataTable dt not an XDocument, you can convert the table to an XDocument directly using the following extension method:

    public static partial class XNodeExtensions
    {
        public static XDocument ToXDocument(this DataTable dt, XmlWriteMode mode = XmlWriteMode.IgnoreSchema)
        {
            var doc = new XDocument();
            using (var writer = doc.CreateWriter())
                dt.WriteXml(writer, mode);
            return doc;
        }
    }
    

    And then do:

    var doc = dt.ToXDocument(XmlWriteMode.IgnoreSchema);
    

    Demo fiddle here.