Modify OuterXml of a given node to keep it in file, but render node useless


I'm working on modifying XML files from an existing system using PowerShell. Our system supports multiple nodes in one body, and depending on what a node's OuterXml is set to, the system will bypass that node and use the other node of similar value (I suppose this may be true of all XML files, but I do not want to make poor assumptions).

Consider this XML file example. The way to modify our files to stop the use of the specific node is to prepend the node's OuterXml with an x:

<Configuration>
  <MainBlock>
    <Object NO="01" REVISION="255">
       <CODE TRANSPARENT="TRUE" />
    </Object>
    <Object NO="02" REVISION="255">
       <CODE TRANSPARENT="TRUE" />
    </Object>
    <Object NO="03" REVISION="255">
       <CODE TRANSPARENT="TRUE" />
    </Object>
    <Object NO="04" REVISION="255">
       <CODE TRANSPARENT="TRUE" />
    </Object>
    <!--Note - this line is the line I wish to modify so it is no longer in use.  The example below was edited manually-->
    <xObject NO="05" REVISION="255">
       <CODE TRANSPARENT="TRUE" />
    </xObject>
    <!--By prepending the "OuterXML" of the node above to "xObject" instead of "Object" I can keep the original values there, but use the "Object" below with a different code - this is an extremely simplified version of this concept.-->
    <Object NO="05" REVISION="255">
       <CODE TRANSPARENT="FALSE" />
    </Object>
  </MainBlock>
</Configuration>

The problem I'm having specifically is modifying the OuterXml of a given node, since OuterXml is a read-only property. When I use $nodes = $Xml.SelectNodes to gather the Objects, I get a count of 5 (as expected with this solution), but how on earth do I modify the OuterXml of the xObject node to read as Object and the Object node below it to read as xObject? Here's my attempt:

  $Xml=[XML] (Get-Content -Path C:\FooBar.xml)
  
  $nodes=$Xml.SelectNodes("//Configuration/MainBlock/Object")
  $node = $nodes.Item(4).OuterXml #<< Here I can set the $node value as a STRING but not an actual XML element, but the $node element DOES contain the contents of the OuterXml Content.
  $nodes.Item(4) = $node #This does not error out but it also does NOTHING to the existing XML content
  
  $Xml.Save("C:\FooBar.xml")

Understand that there are about 25 different attributes/values for each Object node within the actual XML structure I'm dealing with; this sample is just to illustrate the concept quickly and simply. While in the example above I could easily just modify the TRANSPARENT value to "TRUE", that's not going to be feasible for our files.

I know I can use the -replace operator to replace "Object" with "xObject" in PowerShell from a purely "find and replace" perspective, but I'm trying to use PowerShell's XML functions to their best potential.

  #While this works, it's kludgy and isn't foolproof.
  $XmlFile=(Get-Content -Path C:\FooBar.xml)
  $TextToChange=$XmlFile -replace 'xObject','fooObject' -replace '<Object NO=\"05','<xObject NO=\"05' -replace 'fooObject','Object'
  Set-Content -Path C:\FooBar.xml -Value $TextToChange

Solution

    • You're looking for a way to directly rename an XML element, but this is not supported in the [xml] (System.Xml.XmlDocument)-related APIs, where an element's Name property is read-only.

    • That $nodes.Item(4) = $node is a quiet no-op rather than an error is an unfortunate PowerShell bug: it should be an error, because .Item() is a method here, so its result cannot be assigned to.
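Both observations can be checked with a minimal sketch like the one below. It assumes an in-memory sample instead of C:\FooBar.xml, and the sample string and variable names are illustrative only. The try/catch blocks are there because the exact error behavior may differ between PowerShell versions; the only point made is that neither attempt changes the document.

```powershell
# Illustrative in-memory sample (an assumption, not the asker's real file).
$xml = [xml] '<Configuration><MainBlock><Object NO="05"><CODE TRANSPARENT="FALSE" /></Object></MainBlock></Configuration>'
$before = $xml.OuterXml

$nodes = $xml.SelectNodes('//Configuration/MainBlock/Object')

# Assigning to the result of the .Item() method call: described above as a
# quiet no-op. The try/catch guards against versions that report an error.
try { $nodes.Item(0) = '<xObject NO="05" />' } catch { }

# The Name property of an XmlElement has no setter in the .NET API,
# so an in-place rename attempt is expected to fail.
try { $nodes.Item(0).Name = 'xObject' } catch { }

# Either way, the document text is the same as before.
$unchanged = $xml.OuterXml -ceq $before
$unchanged
```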


    Workarounds:

    • If you don't mind working with different APIs, from the System.Xml.Linq namespace, see jdweng's answer.

      • Note:
        • The System.Xml.Linq.XDocument-focused APIs are the modern successor to the System.Xml.XmlDocument ([xml]) APIs and are unquestionably easier to work with in C# and, for structural DOM manipulations such as in this case, possibly also in PowerShell. However, unlike for [xml], PowerShell offers no syntactic sugar (see below) for System.Xml.Linq.XDocument, so learning the specifics of the latter APIs is a must.

        • By contrast, in PowerShell, reading an XML DOM as well as making non-structural modifications is usually easier, due to PowerShell's close integration with the [xml] APIs that allows treating an XML DOM like an object graph, with elements and attributes accessible as properties (see this answer for background information). In simple cases, no API-specific knowledge is required, allowing for a unified OOP approach.

    • A solution based on the [xml] APIs you're using is possible too:

      • As implied by the notes above, structural modification is a bit more cumbersome, and to emulate renaming you have to take the following steps:

        • Create a new element with the desired new name.
        • Move the original element's attributes and child elements to the new element.
        • Insert the new element after the original one.
        • Remove the original element.
      • Aside from the unavoidable use of the [xml]-related methods for the above, the rest of the code takes advantage of the aforementioned OOP-style PowerShell integration, aka PowerShell's adaptation of the [xml] DOM.
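To make the property-style adaptation of the [xml] DOM concrete before the full solution, here is a minimal sketch of reading and of a non-structural modification. The in-memory sample and the variable names are assumptions made for illustration; they are not part of the original files.

```powershell
# Illustrative in-memory sample (an assumption, not the asker's real file).
$xml = [xml] '<Configuration><MainBlock><Object NO="04" REVISION="255"><CODE TRANSPARENT="TRUE" /></Object><Object NO="05" REVISION="255"><CODE TRANSPARENT="FALSE" /></Object></MainBlock></Configuration>'

# Elements and attributes surface as properties; several same-named
# child elements surface as an array.
$objects = $xml.Configuration.MainBlock.Object
$target = $objects | Where-Object NO -eq '05'

# Reading an attribute value through property access.
$oldValue = $target.CODE.TRANSPARENT

# A non-structural modification: assign a new attribute value.
$target.CODE.TRANSPARENT = 'TRUE'

# Verify through the regular DOM API that the document itself was updated.
$newValue = $xml.SelectSingleNode('//Object[@NO="05"]/CODE').GetAttribute('TRANSPARENT')
```

Renaming an element has no such shortcut, which is why the full solution below falls back to explicit [xml] method calls for that part.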

    # Load the XML file into an [xml] (System.Xml.XmlDocument) DOM.
    # Note: 
    #  * The form below is more robust than [xml] (Get-Content -Path C:\FooBar.xml)
    #  * Be sure to use a *full* file path in .NET methods, because .NET's working
    #    dir. usually differs from PowerShell's; use Convert-Path with a relative
    #    path to get a full one, e.g., Convert-Path FooBar.xml
    $fullFilePath = 'C:\FooBar.xml'
    ($xml = [xml]::new()).Load($fullFilePath)
    
    # Get the target elements by their 'NO' attribute value, 
    # both the <xObject> and the <Object> element.
    [array] $targetElements = 
      $xml.Configuration.MainBlock.ChildNodes |
      Where-Object NO -EQ '05'
    
    if ($targetElements.Count -eq 0) { throw "No matching element(s) found." }
    
    # Determine the parent element.
    $parentElement = $targetElements[0].ParentNode
    
    # Construct the replacement elements.
    $replacementElements = 
      $targetElements |
      ForEach-Object {
        # Change 'xObject' to 'Object' and vice versa.
        $newName = if ($_.Name -clike 'x*') { $_.Name.Substring(1) } else { 'x' + $_.Name }
        $replacementElement = $xml.CreateElement($newName, $_.NamespaceURI)
        # Move all attributes from the target node to the replacement...
        $null = foreach ($a in @($_.Attributes)) { $replacementElement.Attributes.Append($a) }
        # ... and all child elements.
        $null = foreach ($e in @($_.ChildNodes)) { $replacementElement.AppendChild($e) }
        # Output the newly constructed replacement element.
        $replacementElement
      } 
    
    # Now insert the replacement elements, each right after its original.
    $i = 0
    $null = 
      $replacementElements | ForEach-Object { 
        $parentElement.InsertAfter($_, $targetElements[$i++])
      }
    # ... and remove the original ones.
    $null = $targetElements | ForEach-Object { $parentElement.RemoveChild($_) }
    
    # Save back to the input file.
    $xml.Save($fullFilePath)
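
For comparison, the System.Xml.Linq route mentioned under the workarounds might look roughly like the sketch below. This is not jdweng's code, which isn't reproduced here; it is a minimal illustration that relies on the XElement type exposing a settable Name property. The in-memory sample and all variable names are assumptions; for a real file, [System.Xml.Linq.XDocument]::Load() with a full path and the document's .Save() method would take the place of Parse().

```powershell
# Commonly needed in Windows PowerShell to make the types available.
Add-Type -AssemblyName System.Xml.Linq

# Illustrative in-memory sample (an assumption, not the asker's real file).
$doc = [System.Xml.Linq.XDocument]::Parse('<Configuration><MainBlock><xObject NO="05"><CODE TRANSPARENT="TRUE" /></xObject><Object NO="05"><CODE TRANSPARENT="FALSE" /></Object></MainBlock></Configuration>')

# Collect the target elements up front, so that enumeration has finished
# before any element is modified.
$targets = @(
  $doc.Root.Element('MainBlock').Elements() |
    Where-Object { $_.Attribute('NO').Value -eq '05' }
)

foreach ($el in $targets) {
  # Change 'xObject' to 'Object' and vice versa.
  $oldName = $el.Name.LocalName
  $newName = if ($oldName -clike 'x*') { $oldName.Substring(1) } else { 'x' + $oldName }
  # Assign the new name directly, keeping the element's original namespace.
  $el.Name = $el.Name.Namespace.GetName($newName)
}

# The element names after the swap, in document order.
$names = $targets | ForEach-Object { $_.Name.LocalName }
```

Attributes and child elements stay attached to the renamed elements, so the create/move/insert/remove steps of the [xml]-based solution aren't needed here.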