How to remove XML nodes that are not in an array of XPath strings?

I have an array of XPath expressions and an XML feed.

When the feed comes in, I want to filter each XML file by removing the nodes that are not in my array of XPaths.

I can think of a very dirty way to do this:

1) For each node in the XML, form its XPath.

2) Check whether it's in the array.

3) If not, remove the node.

Is there a cleaner way?

Solution

"When the feed comes in, I want to filter each XML file by removing the nodes that are not in my array of XPaths."

Step 1. Select all nodes that aren't selected by the given XPath expressions

I guess that by "nodes" you mean elements. If so, this XPath expression:

//*[count(. | yourExpr1 | yourExpr2 ... | yourExprN)
   >
    count(yourExpr1 | yourExpr2 ... | yourExprN)
   ]

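selects all elements in the XML document that aren't selected by any of your N XPath expressions yourExpr1, yourExpr2, ..., yourExprN. Because the expressions sit inside a predicate, they should be absolute (starting with / or //); a relative expression would be evaluated against each candidate element instead of against the document.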

If by "nodes" you mean elements, text nodes, processing-instruction nodes (PIs), comment nodes and attribute nodes, use this XPath expression to select all nodes not selected by your N XPath expressions:

(//node() | //*/@*)
   [count(. | yourExpr1 | yourExpr2 ... | yourExprN)
   >
    count(yourExpr1 | yourExpr2 ... | yourExprN)
   ]
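
Since the question starts from an array of XPath strings, either expression can be assembled mechanically from that array. Here is a minimal sketch, assuming Java as the host language (the question doesn't name one) and assuming every entry in the array is an absolute XPath. ComplementXPath, build and the allNodeKinds flag are names made up for this illustration:

```java
import java.util.List;

final class ComplementXPath {
    // Hypothetical helper: builds the Step 1 expression from absolute XPath strings.
    static String build(List<String> keepExpressions, boolean allNodeKinds) {
        if (keepExpressions.isEmpty()) {
            throw new IllegalArgumentException("need at least one XPath expression");
        }
        // yourExpr1 | yourExpr2 | ... | yourExprN
        String union = String.join(" | ", keepExpressions);
        // Candidates: elements only, or every node kind plus attributes.
        String candidates = allNodeKinds ? "(//node() | //*/@*)" : "//*";
        return candidates + "[count(. | " + union + ") > count(" + union + ")]";
    }

    public static void main(String[] args) {
        System.out.println(build(List.of("/feed", "/feed/item"), false));
        // prints: //*[count(. | /feed | /feed/item) > count(/feed | /feed/item)]
    }
}
```

The union is repeated on both sides of the comparison on purpose: XPath 1.0 has no way to bind it to a name inside the expression itself.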

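Step 2. Delete all nodes selected in Step 1.

For each of the nodes selected in Step 1 above, use:

 node.ParentNode.RemoveChild(node);

Attribute nodes are the exception. In the .NET DOM (System.Xml), which this call appears to target, an attribute's ParentNode is null, so remove it through its owner element instead, for example with attr.OwnerElement.RemoveAttributeNode(attr).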
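
Putting Step 2 into code, here is a rough sketch in Java with the JDK's built-in DOM and XPath 1.0 classes, whose getParentNode() and removeChild() mirror the call above. FeedFilter and its method names are invented for the example, as are the sample feed and the keep-list. One assumption worth stating loudly: removing an element takes its whole subtree with it, so the keep-list in the example also selects the ancestors (/feed and /feed/item) of the element it wants to keep.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FeedFilter {
    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Removes every node selected by xpath and returns how many were selected.
    static int removeSelected(Document doc, String xpath) {
        NodeList selected;
        try {
            selected = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalArgumentException("bad XPath: " + xpath, e);
        }
        int count = selected.getLength();
        for (int i = 0; i < count; i++) {
            Node node = selected.item(i);
            if (node instanceof Attr) {
                // Attributes have no parent node; detach via the owner element.
                Attr attr = (Attr) node;
                if (attr.getOwnerElement() != null) {
                    attr.getOwnerElement().removeAttributeNode(attr);
                }
            } else if (node.getParentNode() != null) {
                node.getParentNode().removeChild(node);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse("<feed><item><title>a</title><junk>x</junk></item><ad/></feed>");
        String keep = "/feed | /feed/item | /feed/item/title";
        String complement = "//*[count(. | " + keep + ") > count(" + keep + ")]";
        // Selects and removes the two elements outside the keep-list: junk and ad.
        System.out.println(removeSelected(doc, complement));
    }
}
```

The whole node list is selected before anything is removed, so a node whose ancestor has already been detached is simply removed from that detached subtree and the loop carries on.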

Explanation:

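The XPath union operator | produces the union of two node-sets. Therefore the expression yourExpr1 | yourExpr2 ... | yourExprN, when applied to the XML document, produces the set of all nodes that are selected by any of the N given XPath expressions.

A node $n doesn't belong to a set of nodes $ns exactly when:

count($n | $ns) > count($ns)

A node-set holds no duplicates, so the union grows only when $n is not already a member of $ns. In the Step 1 expressions, the context node . plays the role of $n and the union of your expressions plays the role of $ns.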
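
To see the membership test on its own, here is a small Java sketch that evaluates it for one node at a time, with the context node standing in for $n. MembershipTest, notInSet and the three-element sample document are assumptions made for the illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class MembershipTest {
    // True when n is NOT in the node-set selected by the absolute expression nodeSetExpr.
    static boolean notInSet(Node n, String nodeSetExpr) {
        String test = "count(. | " + nodeSetExpr + ") > count(" + nodeSetExpr + ")";
        try {
            return (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate(test, n, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalArgumentException("bad XPath: " + test, e);
        }
    }

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<r><a/><a/><b/></r>");
        Node a = doc.getElementsByTagName("a").item(0);
        Node b = doc.getElementsByTagName("b").item(0);
        System.out.println(notInSet(a, "/r/a")); // false: the count is 2 on both sides
        System.out.println(notInSet(b, "/r/a")); // true: the union has 3 nodes, the set has 2
    }
}
```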