I have an array of XPath expressions and an XML feed.
When the feed comes in, I want to filter each XML file by removing the nodes that are not in my array of XPaths.
I can think of a very dirty way to do this:
1) for each node in the XML, I form its XPath
2) check if it's in the array
3) if not, remove it.
Is there a cleaner way?
When the feed comes in, I want to filter each xml file by removing the nodes that are not in my array of xpath's
Step 1. Select all nodes that aren't selected by the given XPath expressions.
I guess that by "nodes" you mean elements. If so, this XPath expression:
//*[count(. | yourExpr1 | yourExpr2 ... | yourExprN)
>
count(yourExpr1 | yourExpr2 ... | yourExprN)
]
selects all elements in the XML document that aren't selected by any of your N XPath expressions yourExpr1, yourExpr2, ..., yourExprN.
If by "nodes" you mean elements, text-nodes, processing-instruction-nodes (PIs), comment-nodes and attribute nodes, use this XPath expression to select all nodes not selected by your N XPath expressions:
(//node() | //*/@*)
[count(. | yourExpr1 | yourExpr2 ... | yourExprN)
>
count(yourExpr1 | yourExpr2 ... | yourExprN)
]
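Step 2. Delete all nodes selected in Step 1.
For each of the nodes selected in Step 1 above, use:
node.ParentNode.RemoveChild(node);
Note that in .NET an attribute node's ParentNode is always null, so attributes have to be removed through their owning element instead, e.g. attr.OwnerElement.Attributes.Remove(attr);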
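Putting the two steps together for the elements-only case, here is a minimal sketch rather than a definitive implementation. It assumes .NET's System.Xml, that every expression in the array is an absolute XPath 1.0 expression (a relative one would be re-evaluated against each candidate element inside the predicate), and that the array also lists the ancestors of everything you want to keep, since an unlisted ancestor is removed together with its whole subtree. The names XPathFilter and RemoveUnselected are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class XPathFilter
{
    // Removes every element that none of the given absolute XPath
    // expressions selects (Step 1 + Step 2, elements only).
    public static void RemoveUnselected(XmlDocument doc, IEnumerable<string> keepExprs)
    {
        // Build "(expr1) | (expr2) | ... | (exprN)".
        string union = string.Join(" | ", keepExprs.Select(e => "(" + e + ")"));

        // Step 1: elements whose addition to the union makes it grow,
        // i.e. elements that are not already in the union.
        string xpath = "//*[count(. | " + union + ") > count(" + union + ")]";

        // Copy the result to a list first so that removing nodes
        // cannot disturb the iteration over the node list.
        List<XmlNode> toRemove = doc.SelectNodes(xpath).Cast<XmlNode>().ToList();

        // Step 2: detach each selected element from its parent.
        foreach (XmlNode node in toRemove)
        {
            if (node.ParentNode != null)
            {
                node.ParentNode.RemoveChild(node);
            }
        }
    }
}
```

For example, with the document `<feed><item><title>t</title><junk>x</junk></item><ad>buy</ad></feed>` and the array `/feed`, `/feed/item`, `/feed/item/title`, the call `XPathFilter.RemoveUnselected(doc, exprs)` should leave `<feed><item><title>t</title></item></feed>`.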
Explanation:
The XPath union operator | produces the union of two node-sets. Therefore the expression yourExpr1 | yourExpr2 ... | yourExprN, when applied on the XML document, produces the set of all nodes that are selected by any of the N given XPath expressions.
A node $n doesn't belong to a set of nodes $ns exactly when
...
count($n | $ns) > count($ns)
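The membership test can also be tried on a single node. The following is only a sketch under the same assumptions as before (.NET, an absolute XPath 1.0 expression for the set); NodeSetMembership and IsOutside are hypothetical names. The union of $n and $ns has one more node than $ns only when $n was not already in it, which is what the comparison checks.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

// Hypothetical helper, not part of any library.
public static class NodeSetMembership
{
    // Returns true when `node` is NOT in the node-set selected by `setExpr`.
    // `setExpr` is assumed to be an absolute XPath 1.0 expression.
    public static bool IsOutside(XmlNode node, string setExpr)
    {
        // The navigator is positioned on `node`, so "." plays the role of $n.
        XPathNavigator nav = node.CreateNavigator();
        string test = "count(. | (" + setExpr + ")) > count(" + setExpr + ")";

        // A comparison expression evaluates to a boolean.
        return (bool)nav.Evaluate(test);
    }
}
```

Usage, given a document `<r><a/><b/></r>`: `NodeSetMembership.IsOutside(doc.SelectSingleNode("/r/b"), "/r/a")` is true because b is not in the set, while the same call for `/r/a` is false.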