I have an XML document that looks kinda like this:
<root>
Maybe some text
<thing>thing can have text</thing>
<thing>it can even be on multiple
lines
</thing>
<thing>a third thing</thing>
This text resets the numbering
<thing>this thing is not part of the above list and should have number 1</thing>
<some-element-not-thing>Also resets numbering</some-element-not-thing>
<thing>this thing should also have number 1</thing>
</root>
I need to number the <thing>s when they come consecutively, by giving each an attribute called "number". That is, my desired result is:
<root>
Maybe some text
<thing number="1">thing can have text</thing>
<thing number="2">it can even be on multiple
lines
</thing>
<thing number="3">a third thing</thing>
This text resets the numbering
<thing number="1">this thing is not part of the above list and should have number 1</thing>
<some-element-not-thing>Also resets numbering</some-element-not-thing>
<thing number="1">this thing should also have number 1</thing>
</root>
How would I approach something like this? I can't see a way to find text between elements in XmlDocument (but it does let me enumerate elements by order, so I can reset numbering when I encounter something that is not <thing>), and I am not sure LINQ to XML allows me to get text between elements either, as it will only yield elements or descendants, neither of which represents the "loose text".
Perhaps this "loose text" is bad (but apparently parse-able) XML?
EDIT: I completely misunderstood my own problem. Apparently there is no text between the elements; it was the result of an error I fixed afterwards. The solution I ended up using was just enumerating the nodes and altering their attributes that way (using XmlDocument and ignoring whitespace), similar to what was suggested below. I apologize for not turning this question around in my head more and/or spending more time researching. If people think this question does not contribute to SO adequately I will not mind deleting it.
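For anyone landing here, the XmlDocument variant described in the edit might look roughly like the sketch below. It is not the exact code I used: the class name ThingNumbering and the method NumberThings are made up for illustration, and it assumes the <thing> elements are direct children of the root. Any non-whitespace node that is not a <thing> (text, another element, a comment) resets the counter.

```csharp
using System;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class ThingNumbering
{
    // Numbers each run of consecutive <thing> children of the root element.
    public static string NumberThings(string xml)
    {
        // PreserveWhitespace = false (the default) drops whitespace-only
        // nodes between elements while loading.
        var doc = new XmlDocument { PreserveWhitespace = false };
        doc.LoadXml(xml);

        var counter = 0;
        foreach (XmlNode node in doc.DocumentElement.ChildNodes)
        {
            if (node is XmlElement element && element.Name == "thing")
            {
                // Consecutive <thing>: assign the next number in the run.
                element.SetAttribute("number", (++counter).ToString());
            }
            else if (node is XmlWhitespace || node is XmlSignificantWhitespace)
            {
                // Whitespace never interrupts a run.
            }
            else
            {
                // Loose text or any other node starts a new run.
                counter = 0;
            }
        }

        return doc.OuterXml;
    }
}
```

Calling ThingNumbering.NumberThings(xml) returns the modified document as a string. Because whitespace is not preserved, the original line breaks and indentation between elements are not kept in the output.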
As always, it would be helpful if you provided what you've already tried before asking questions. There are lots of blog posts and questions about parsing and manipulating XML.
As a start, I would parse using LINQ to XML. Then all you have to do is loop through the nodes below the root element, assigning each thing element an incrementing number. This counter is reset when the next node is not a thing and not whitespace:
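    using System.Xml.Linq;

    // xml holds the document text; PreserveWhitespace keeps whitespace-only
    // text nodes so the output layout matches the input.
    var doc = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);
    var i = 0;

    // Nodes() yields text nodes as well as elements, so "loose text" is visible here.
    foreach (var node in doc.Root.Nodes())
    {
        var element = node as XElement;
        var text = node as XText;
        var isThing = element != null && element.Name == "thing";
        var isWhitespace = text != null && string.IsNullOrWhiteSpace(text.Value);

        if (isThing)
        {
            // SetAttributeValue adds the attribute, or overwrites an existing one.
            element.SetAttributeValue("number", ++i);
        }
        else if (!isWhitespace)
        {
            // Any other element or non-blank text ends the current run.
            i = 0;
        }
    }

    var result = doc.ToString();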