
Powershell empty XML element formatted to be one line


I need my XML format to be slightly different than the way PowerShell saves it by default. Here is a code example:

[xml]$XML = New-Object system.Xml.XmlDocument
$Declaration = $XML.CreateXmlDeclaration("1.0","UTF-8",$null)
$XML.AppendChild($Declaration) > $null

$Temp = $XML.CreateElement('Basket')
$Temp.InnerText = $test
$XML.AppendChild($Temp)

$Temp1 = $XML.CreateElement('Item')
$Temp1.InnerText = ''
$Temp.AppendChild($Temp1)

$XML.save('test.xml')

This results in:

<?xml version="1.0" encoding="UTF-8"?>
<Basket>
  <Item>
  </Item>
</Basket>

My required XML should look the following:

<?xml version="1.0" encoding="UTF-8"?>
<Basket>
  <Item></Item>
</Basket>

Is this possible?

If I add $XML.PreserveWhitespace = $true, everything ends up on one line, and elements don't have a PreserveWhitespace property.

One solution I found is to add a space ($Temp1.InnerText = ' ') and then clean up the output in a second step. But I was wondering if there is a trick to have PowerShell output empty elements on one line. Unfortunately, the target application that needs to read the XML only accepts the required formatting shown above.


Solution

  • $Temp1.InnerText = ''
    

    is your attempt to force the XML serialization to use the single-line <tag></tag> form rather than the self-closing <tag /> form, because, even though both forms should be equivalent, the particular consumer of your serialized XML (Adobe) only accepts the <tag></tag> form.

    Your attempt is based on distinguishing between a truly empty element - one that has no child nodes - and one that has an empty-string text child node (implicitly created by .InnerText = ''), in the hopes that an element with child nodes - even though the only child node is the empty string - always serializes in the <tag>...</tag> form.

    Your attempt:

    • isn't honored by the XmlDocument type's .Save() method (which prompted your question)

      • The specific behavior you've encountered was ultimately classified as by design - see GitHub issue #31079.
    • is honored by the LINQ-based XDocument type's .Save() method.

    Therefore, you have two options:


    Workaround, if you're given an existing XmlDocument instance:

    If you construct an XDocument instance from the (non-pretty-printed) XML string returned by your XmlDocument instance's .OuterXml property ($XML.OuterXml), saving the resulting XDocument instance to a file uses the desired <tag></tag> form, assuming you kept the explicitly added empty-string child text node in your code, i.e., $Temp1.InnerText = '':

    # Creates a pretty-printed XML file with the empty elements
    # represented in "<tag></tag>" form from the System.Xml.XmlDocument
    # instance stored in `$XML`.
    ([System.Xml.Linq.XDocument] $XML.OuterXml).Save("$PWD/test.xml")
    

    While this involves an extra round of serialization and parsing, it is a simple and pragmatic solution.

    If the XML DOM object wasn't given to you and you have the option to construct it yourself, it's better to construct it as an XDocument instance to begin with, as shown next.


    Constructing your document as an XDocument instance to begin with:

    # PSv5+ syntax for simplifying type references.
    using namespace System.Xml.Linq
    
    # Create the XDocument with its declaration.
    $xd = [XDocument]::new(
            # Note: [NullString]::Value is needed to pass a true null value - $null doesn't work.
            [XDeclaration]::new('1.0', 'UTF-8', [NullString]::Value)
          )
    
    # Add nodes.
    $xd.Add(($basket = [XElement] [XName] 'Basket'))
    $basket.Add(($item = [XElement] [XName] 'Item'))
    
    # Add an empty-string child node to the '<Item>' element to
    # force it to serialize as '<Item></Item>' rather than as '<Item />'
    $item.SetValue('')
    
    # Save the document to a file.
    $xd.Save("$PWD/test.xml")
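
To see the distinction described above without writing a file, the following is a minimal sketch that compares a truly empty XElement with one that has an empty-string child, and then round-trips a small XmlDocument through XDocument in memory. The variable names ($empty, $emptyString, $roundTripped) and the sample XML string are made up for illustration, and the Add-Type call is an assumption: it may be needed in Windows PowerShell and is expected to be harmless in PowerShell 7+.

```powershell
# Assumption: may be required in Windows PowerShell; expected to be a no-op otherwise.
Add-Type -AssemblyName System.Xml.Linq

# Truly empty element: no child nodes at all.
$empty = [System.Xml.Linq.XElement] [System.Xml.Linq.XName] 'Item'

# Element whose only child is an empty-string text node.
$emptyString = [System.Xml.Linq.XElement] [System.Xml.Linq.XName] 'Item'
$emptyString.SetValue('')

$empty.ToString()        # <Item />
$emptyString.ToString()  # <Item></Item>

# Round trip: parse an XmlDocument's .OuterXml into an XDocument.
# The sample document below is hypothetical.
$xml = [xml] '<Basket><Item></Item></Basket>'
$roundTripped = [System.Xml.Linq.XDocument]::Parse($xml.OuterXml)

# Pretty-printed output; the <Item></Item> form is retained.
$roundTripped.ToString()
```

Calling .ToString() gives the same element formatting as .Save() but omits the XML declaration, which makes it convenient for a quick check before writing the real file.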