Search code examples
xmlpowershellcsvxml-parsingpattern-matching

Null .csv files when extracting node data from .xml files


Trying this again. Ok so I have a powershell script that processes .xml files in a directory. For each .xml file the script processes it is looking for every instance of <AuditRecord> and extracts all data within each node.

The challenge I am having when I try to write results to a .csv file it comes up empty. No headers, no data, absolutely nothing. I've been racking my brain over why it won't simply write the data into the .csv file

Update 07.31.2024: Incorporated recent suggestions. Code sample updated - no errors: but it still generates a .csv file with no headers or data in it (blank)

Update 07.31.2024: Refactored code. This time it doesn't generate any .csv file. This is very discouraging, don't know what to do at this point. I can fix the params to read the elements correctly as suggested, but I can't even generate a simple .csv with content. Any help would be sorely appreciated.

CODE

    cls
    
    $inPath = "C:\deathstar\test\in\"
    $outPath = "C:\deathstar\test\in\"
    
    # Create the 'out' directory if it doesn't exist
    if (-not (Test-Path $outPath)) {
        New-Item -ItemType Directory -Path $outPath | Out-Null
    }
    
    # Get all XML files in the 'in' directory
    $xmlFiles = Get-ChildItem $inPath -Filter *.xml
    
    foreach ($file in $xmlFiles) {
        [xml]$xmlContent = Get-Content $file.FullName
    
        $auditRecord = $xmlContent.AuditRecord
    
        if ($auditRecord -ne $null) {
            $outputData = [ordered]@{
                AuditRecordID = $auditRecord.AuditRecordID
                UserID = $auditRecord.PIUser.Name
                UTCSeconds = $auditRecord.PITime.UTCSeconds
                LocalDate = $auditRecord.PITime.LocalDate
                PIModuleAction = $auditRecord.PIConfigurationDB.PIModules.Action
                PIModuleUID = ($auditRecord.PIConfigurationDB.PIModules.PIModule.UniqueID -join ", ")
                PIModuleName = $auditRecord.PIConfigurationDB.PIModules.PIModule.Name
                PIModuleAttribName = ($auditRecord.PIConfigurationDB.PIModules.PIModule.PIModuleAttributes.PIModuleAttribute.Name -join ", ")
                PIModValueBefore = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIModuleAttributes.PIModuleAttribute.Value.Before.Type[0]
                PIModValueAfter = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIModuleAttributes.PIModuleAttribute.Value.After.Type[0]
                PIModAttribName = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIModuleAttributes.PIModuleAttribute.Name[1]
                PIModAttribValBefore = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIModuleAttributes.PIModuleAttribute.Value.Before.Type[1]
                PIModAttribValAfter = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIModuleAttributes.PIModuleAttribute.Value.After.Type[1]
                PIPropertyName = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIProperties.PIProperty.Name
                PIPropertyAction = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIProperties.PIProperty.Action
                PIPropertyUNCName = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIProperties.PIProperty.ParentUNC_Name
                PIPropValBefore = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIProperties.PIProperty.Value.Before.Type
                PIPropertyValAfter = $auditRecord.PIConfigurationDB.PIModules.PIModule.PIProperties.PIProperty.Value.After.Type
            }
    
            $csvFileName = "$outPath\$($file.BaseName).csv"
            $outputData | Export-Csv -Path $csvFileName -NoTypeInformation
        }
    }

.XML FILE SNIPPET

<?xml version="1.0" encoding="UTF-8" ?>
<PIAudit
  xmlns="xml.osisoft.com-schemas-piaudit"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="xml.osisoft.com-schemas-piaudit http://xml.osisoft.com/Schemas/PIAudit">
  <PIServer IPHost="DEATH-STAR" IPAddress="192.168.1.1"/>
  <ExportDate UTCSeconds="1722272370" LocalDate="2024-07-29T12:59:30-04:00"/>
  <OSUser Domain="RX" Name="dvadar" />
  <AuditRecords ExportFileName="C:\Batch\1\input\pibasessAudit.dat3_Apr_24_17_24_10_16502" RecordCount="420" AuditFileName="D:\PI\log\pibasessAudit.dat" CreationDate="2024-03-28T17:47:42-04:00" Mask="ffffffffh">
    <AuditRecord AuditRecordID="b066447a-20de-4659-8e5e-41ecb27d9d39">
      <PIUser UserID="1" Name="piadmin"/>
      <PITime UTCSeconds="1711662501" LocalDate="2024-03-28T17:48:21-04:00" />
      <PIConfigurationDB>
        <PIModules Action="Edit">
          <PIModule UniqueID="feea9e80-3d3f-4f45-b58d-275e4845bcde, 31-Dec-69 16:00:01" Name="Database">
            <PIModuleAttributes>
              <PIModuleAttribute Name="ModifyDate">
                <Value>
                  <Before Type="xs:dateTime">2024-03-28T17:47:42-04:00</Before>
                  <After Type="xs:dateTime">2024-03-28T17:48:21-04:00</After>
                </Value>
              </PIModuleAttribute>
              <PIModuleAttribute Name="Revision">
                <Value>
                  <Before Type="xs:long">35738</Before>
                  <After Type="xs:long">35739</After>
                </Value>
              </PIModuleAttribute>
            </PIModuleAttributes>
            <PIProperties>
              <PIProperty Name="TemplateDataSet" Action="Edit" ParentUNC_Name="\\PIProperties">
                <Value>
                  <Before Type="xs:hexBinary"></Before>
                  <After Type="xs:hexBinary"></After>
                </Value>
              </PIProperty>
            </PIProperties>
          </PIModule>
        </PIModules>
      </PIConfigurationDB>
    </AuditRecord>
    <AuditRecord AuditRecordID="d6588207-7ab2-4a4f-bfba-f62621a2cae2">
      <PIUser UserID="1" Name="piadmin"/>
      <PITime UTCSeconds="1711662705" LocalDate="2024-03-28T17:51:45-04:00" />
      <PIConfigurationDB>
        <PIModules Action="Edit">
          <PIModule UniqueID="feea9e80-3d3f-4f45-b58d-275e4845bcde, 31-Dec-69 16:00:01" Name="Database">
            <PIModuleAttributes>
              <PIModuleAttribute Name="ModifyDate">
                <Value>
                  <Before Type="xs:dateTime">2024-03-28T17:48:21-04:00</Before>
                  <After Type="xs:dateTime">2024-03-28T17:51:45-04:00</After>
                </Value>
              </PIModuleAttribute>
              <PIModuleAttribute Name="Revision">
                <Value>
                  <Before Type="xs:long">35739</Before>
                  <After Type="xs:long">35740</After>
                </Value>
              </PIModuleAttribute>
            </PIModuleAttributes>
            <PIProperties>
              <PIProperty Name="TemplateDataSet" Action="Edit" ParentUNC_Name="\\PIProperties">
                <Value>
                  <Before Type="xs:hexBinary"></Before>
                  <After Type="xs:hexBinary"></After>
                </Value>
              </PIProperty>
            </PIProperties>
          </PIModule>
        </PIModules>
      </PIConfigurationDB>
    </AuditRecord>
    <AuditRecord AuditRecordID="ff041b3b-ca18-4909-a16a-18444d8b17b9">
      <PIUser UserID="1" Name="piadmin"/>
      <PITime UTCSeconds="1711669167" LocalDate="2024-03-28T19:39:27-04:00" />
      <PIConfigurationDB>
        <PIModules Action="Edit">
          <PIModule UniqueID="feea9e80-3d3f-4f45-b58d-275e4845bcde, 31-Dec-69 16:00:01" Name="Database">
            <PIModuleAttributes>
              <PIModuleAttribute Name="ModifyDate">
                <Value>
                  <Before Type="xs:dateTime">2024-03-28T17:51:45-04:00</Before>
                  <After Type="xs:dateTime">2024-03-28T19:39:27-04:00</After>
                </Value>
              </PIModuleAttribute>
              <PIModuleAttribute Name="Revision">
                <Value>
                  <Before Type="xs:long">35740</Before>
                  <After Type="xs:long">35741</After>
                </Value>
              </PIModuleAttribute>
            </PIModuleAttributes>
            <PIProperties>
              <PIProperty Name="TemplateDataSet" Action="Edit" ParentUNC_Name="\\PIProperties">
                <Value>
                  <Before Type="xs:hexBinary"></Before>
                  <After Type="xs:hexBinary"></After>
                </Value>
              </PIProperty>
            </PIProperties>
          </PIModule>
        </PIModules>
      </PIConfigurationDB>
    </AuditRecord>
  </AuditRecords>
</PIAudit>

DESIRED OUTPUT

enter image description here


Solution

  • Consider XSLT, the special-purpose, industry-wide language designed to transform XML files into other XML, HTML, and even text files like CSVs! PowerShell can run XSLT with built-in System.Xml.Xsl or other, third-party XSLT processors (Saxon, Xalan, etc.) which can run higher versions 2.0 and 3.0.

    Specifically, below XSLT 1.0 will walk through every descendant element and attribute of root (assumed to be <AuditRecord>) and extracts underlying text values in comma separation. To create unique headers, ancestor elements three levels up are concatenated together with underscore separators. For elements that do not have such ancestor, underscores are stretched out with no values between. Also, some elements do not have any text and hence return empty cells. Consider cleaning this up later on!

    XSLT (save as .xsl file, a special .xml file)

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="text" indent="yes"/>
        <xsl:strip-space elements="*"/>
    
        <xsl:template match="/AuditRecord">
            <!-- HEADERS -->
            <xsl:for-each select="descendant::*/@* | descendant::*">
                <xsl:value-of select="concat(name(../../..), '_', name(../..), '_', ../@Name, '_', name(..), '_', name())"/>
                <xsl:if test="position() != last()">
                    <xsl:text>,</xsl:text>
                </xsl:if>
                <xsl:if test="position() = last()">
                    <xsl:text>&#xa;</xsl:text>
                </xsl:if>
            </xsl:for-each>
            <!-- ROWS -->
            <xsl:apply-templates select="descendant::*"/>
        </xsl:template>
    
        <!-- PARSE ELEMENTS -->
        <xsl:template match="*">
            <xsl:apply-templates select="@*"/>
            <xsl:value-of select="text()"/>
            <xsl:if test="position() != last()">
                <xsl:text>,</xsl:text>
            </xsl:if>
            <xsl:if test="position() = last()">
                <xsl:text>&#xa;</xsl:text>
            </xsl:if>
        </xsl:template>
    
        <!-- PARSE ATTRIBUTES -->
        <xsl:template match="@*">
            <xsl:value-of select="."/>
            <xsl:if test="position() != last()">
                <xsl:text>,</xsl:text>
            </xsl:if>
        </xsl:template>
    </xsl:stylesheet>
    

    Output (CSV shown in markdown table version)

    | ___AuditRecord_PIUser | _AuditRecord_piadmin_PIUser_UserID | _AuditRecord_piadmin_PIUser_Name | ___AuditRecord_PITime     | _AuditRecord__PITime_UTCSeconds | _AuditRecord__PITime_LocalDate | ___AuditRecord_PIConfigurationDB     | _AuditRecord__PIConfigurationDB_PIModules | AuditRecord_PIConfigurationDB__PIModules_Action | AuditRecord_PIConfigurationDB__PIModules_PIModule | PIConfigurationDB_PIModules_Database_PIModule_UniqueID | PIConfigurationDB_PIModules_Database_PIModule_Name | PIConfigurationDB_PIModules_Database_PIModule_PIModuleAttributes | PIModules_PIModule__PIModuleAttributes_PIModuleAttribute | PIModule_PIModuleAttributes_ModifyDate_PIModuleAttribute_Name | PIModule_PIModuleAttributes_ModifyDate_PIModuleAttribute_Value | PIModuleAttributes_PIModuleAttribute__Value_Before | PIModuleAttribute_Value__Before_Type | PIModuleAttributes_PIModuleAttribute__Value_After | PIModuleAttribute_Value__After_Type | PIModules_PIModule__PIModuleAttributes_PIModuleAttribute | PIModule_PIModuleAttributes_Revision_PIModuleAttribute_Name | PIModule_PIModuleAttributes_Revision_PIModuleAttribute_Value | PIModuleAttributes_PIModuleAttribute__Value_Before | PIModuleAttribute_Value__Before_Type | PIModuleAttributes_PIModuleAttribute__Value_After | PIModuleAttribute_Value__After_Type | PIConfigurationDB_PIModules_Database_PIModule_PIProperties | PIModules_PIModule__PIProperties_PIProperty | PIModule_PIProperties_TemplateDataSet_PIProperty_Name | PIModule_PIProperties_TemplateDataSet_PIProperty_Action | PIModule_PIProperties_TemplateDataSet_PIProperty_ParentUNC_Name | PIModule_PIProperties_TemplateDataSet_PIProperty_Value | PIProperties_PIProperty__Value_Before | PIProperty_Value__Before_Type | PIProperties_PIProperty__Value_After | PIProperty_Value__After_Type |
    |-----------------------|------------------------------------|----------------------------------|---------------------------|---------------------------------|--------------------------------|--------------------------------------|-------------------------------------------|-------------------------------------------------|---------------------------------------------------|--------------------------------------------------------|----------------------------------------------------|------------------------------------------------------------------|----------------------------------------------------------|---------------------------------------------------------------|----------------------------------------------------------------|----------------------------------------------------|--------------------------------------|---------------------------------------------------|-------------------------------------|----------------------------------------------------------|-------------------------------------------------------------|--------------------------------------------------------------|----------------------------------------------------|--------------------------------------|---------------------------------------------------|-------------------------------------|------------------------------------------------------------|---------------------------------------------|-------------------------------------------------------|---------------------------------------------------------|-----------------------------------------------------------------|--------------------------------------------------------|---------------------------------------|-------------------------------|--------------------------------------|------------------------------|
    | 1                     | piadmin                            | 1711662501                       | 2024-03-28T17:48:21-04:00 |                                 | Edit                           | feea9e80-3d3f-4f45-b58d-275e4845bcde | 31-Dec-69 16:00:01                        | Database                                        |                                                   | ModifyDate                                             |                                                    | xs:dateTime2024-03-28T17:47:42-04:00                             | xs:dateTime2024-03-28T17:48:21-04:00                     | Revision                                                      |                                                                | xs:long35738                                       | xs:long35739                         |                                                   | TemplateDataSet                     | Edit                                                     | \\PIProperties                                              |                                                              | xs:hexBinary                                       | xs:hexBinary                         |                                                   |                                     |                                                            |                                             |                                                       |                                                         |                                                                 |                                                        |                                       |                               |                                      |                              |
    

    PowerShell

    # Define the directory containing the XML files
    $directory = "C:\test\"
    
    # Get all XML files in the directory
    $xmlFiles = Get-ChildItem -Path $directory -Filter *.xml
    
    # Load XSLT
    $xslt = New-Object System.Xml.Xsl.XslCompiledTransform;
    $xslt.Load("C:\path\to\style.xsl");
    
    # Loop through each XML file
    foreach ($file in $xmlFiles) {
        # Define the output CSV file path
        $csvFilePath = Join-Path -Path $directory -ChildPath ($file.BaseName + ".csv")
    
        # Run transformation (XML -> CSV)
        $xslt.Transform($file.FullName, $csvFilePath);
    }