Search code examples
excelxmlvbaxfdf

How to access specific value in imported XML with VBA


I have a completed .pdf form (XFDF, XML) that I am importing into excel. The goal is to get a "Field Name" and a "Value" for that field from the XML.

Currently I am able to get the "Value" (Blue accent in picture below) by using a simple command such as:

 list(4).text        

I am struggling on how to store the "Field Name". In the below picture it is the expression/value highlighted in yellow.

I have tried variations of the following but can't get it to work.

List(4).Attributes(1).value

Any tips on how to access the nested value?

List -> Item(4) -> Attributes -> Item(1) -> value

Code and XML List Image

XFDF RAW:

<?xml version="1.0" encoding="utf-16"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <fields>
    <field name="3C_AQS_1">
      <value>2</value>
    </field>
    <field name="3C_AQS_2">
      <value>3</value>
    </field>
    <field name="3C_AQS_3">
      <value>1</value>
    </field>
    <field name="3C_AYS_1">
      <value>D:23440101</value>
    </field>
    <field name="3C_AYS_2">
      <value>D:13240101</value>
    </field>
    <field name="3C_AYS_3">
      <value>D:12340101</value>
    </field>
    <field name="3C_EEL_1">
      <value>2</value>
    </field>
    <field name="3C_EEL_2">
      <value>5</value>
    </field>
    <field name="3C_EEL_3">
      <value>2</value>
    </field>
    <field name="3C_JobProfile_1">
      <value>1235</value>
    </field>
    <field name="3C_JobProfile_2">
      <value>547</value>
    </field>
    <field name="3C_JobProfile_3">
      <value>1234 fd</value>
    </field>
    <field name="4_PME_DBT">
      <value>yuktyu</value>
    </field>
    <field name="4_PME_Launch">
      <value>ew4tw</value>
    </field>
    <field name="4_PME_Scope">
      <value>serg</value>
    </field>
    <field name="4_PPT_End">
      <value>D:19930714</value>
    </field>
    <field name="4_PPT_Start">
      <value>D:19001201</value>
    </field>
    <field name="Capital_LifeAmount">
      <value>5352</value>
    </field>
    <field name="Capital_Yr">
      <value>1</value>
    </field>
    <field name="Capital_YrAmount">
      <value>123124</value>
    </field>
    <field name="Capital_YrAmountP">
      <value>234</value>
    </field>
    <field name="CostConfidence">
      <value>Med</value>
    </field>
    <field name="Devliverables_1">
      <value>Delev 1</value>
    </field>
    <field name="Devliverables_2">
      <value>Delev 2</value>
    </field>
    <field name="Devliverables_3">
      <value>Delev 3</value>
    </field>
    <field name="Devliverables_4">
      <value>Delev 4</value>
    </field>
    <field name="Devliverables_5">
      <value>Delev 5</value>
    </field>
    <field name="HR Initiative">
      <value>HrInit</value>
    </field>
    <field name="InScope_1">
      <value>InScope 1</value>
    </field>
    <field name="InScope_2">
      <value>InScope 2</value>
    </field>
    <field name="InScope_3">
      <value>InScope 3</value>
    </field>
    <field name="InScope_4">
      <value>InScope 4</value>
    </field>
    <field name="InScope_5">
      <value>InScope 5</value>
    </field>
    <field name="In_HR_Area_1">
      <value>University Relations</value>
    </field>
    <field name="In_HR_Area_2">
      <value>Talent Development</value>
    </field>
    <field name="In_HR_Area_3">
      <value>Change Management</value>
    </field>
    <field name="In_HR_Area_4">
      <value>Global Compensation </value>
    </field>
    <field name="In_HR_Area_5">
      <value>Career Framework</value>
    </field>
    <field name="In_HR_NumEmp_1">
      <value>1</value>
    </field>
    <field name="In_HR_NumEmp_2">
      <value>42345</value>
    </field>
    <field name="In_HR_NumEmp_3">
      <value>12</value>
    </field>
    <field name="In_HR_NumEmp_4">
      <value>12</value>
    </field>
    <field name="In_HR_NumEmp_5">
      <value>11</value>
    </field>
    <field name="In_HR_NumMonth_1">
      <value>1</value>
    </field>
    <field name="In_HR_NumMonth_2">
      <value>3</value>
    </field>
    <field name="In_HR_NumMonth_3">
      <value>4</value>
    </field>
    <field name="In_HR_NumMonth_4">
      <value>5</value>
    </field>
    <field name="In_HR_NumMonth_5">
      <value>1</value>
    </field>
    <field name="In_HR_PerTime_1">
      <value>2</value>
    </field>
    <field name="In_HR_PerTime_2">
      <value>5</value>
    </field>
    <field name="In_HR_PerTime_3">
      <value>2</value>
    </field>
    <field name="In_HR_PerTime_4">
      <value>11</value>
    </field>
    <field name="In_HR_PerTime_5">
      <value>03</value>
    </field>
    <field name="Opex_LifeAmount">
      <value>23424</value>
    </field>
    <field name="Opex_Yr">
      <value>12</value>
    </field>
    <field name="Opex_YrAmount">
      <value>123123</value>
    </field>
    <field name="Opex_YrAmountP">
      <value>345235</value>
    </field>
    <field name="OutScope_1">
      <value>outScope 1</value>
    </field>
    <field name="OutScope_2">
      <value>outScope 2</value>
    </field>
    <field name="OutScope_3">
      <value>outScope 3</value>
    </field>
    <field name="OutScope_4">
      <value>outScope 4</value>
    </field>
    <field name="OutScope_5">
      <value>outScope 5</value>
    </field>
    <field name="Out_HR_Area_1">
      <value>sdfsg</value>
    </field>
    <field name="Out_HR_Area_2">
      <value>dffnjgmkgm,k</value>
    </field>
    <field name="Out_HR_Area_3">
      <value>3453wygh</value>
    </field>
    <field name="Out_HR_Area_4">
      <value>235hc bn</value>
    </field>
    <field name="Out_HR_Area_5">
      <value>234dfg435</value>
    </field>
    <field name="Out_HR_NumEmp_1">
      <value>1</value>
    </field>
    <field name="Out_HR_NumEmp_2">
      <value>6</value>
    </field>
    <field name="Out_HR_NumEmp_3">
      <value>34</value>
    </field>
    <field name="Out_HR_NumEmp_4">
      <value>2</value>
    </field>
    <field name="Out_HR_NumEmp_5">
      <value>78</value>
    </field>
    <field name="Out_HR_NumMonth_1">
      <value>23</value>
    </field>
    <field name="Out_HR_NumMonth_2">
      <value>23</value>
    </field>
    <field name="Out_HR_NumMonth_3">
      <value>1</value>
    </field>
    <field name="Out_HR_NumMonth_4">
      <value>235</value>
    </field>
    <field name="Out_HR_NumMonth_5">
      <value>52</value>
    </field>
    <field name="Out_HR_PerTime_1">
      <value>3</value>
    </field>
    <field name="Out_HR_PerTime_2">
      <value>1</value>
    </field>
    <field name="Out_HR_PerTime_3">
      <value>11</value>
    </field>
    <field name="Out_HR_PerTime_4">
      <value>1</value>
    </field>
    <field name="Out_HR_PerTime_5">
      <value>1</value>
    </field>
    <field name="Problem Statement">
      <value>Prob Statemewnt dghdheh</value>
    </field>
    <field name="ProjSponName_First">
      <value>ProjSponFirst</value>
    </field>
    <field name="ProjSponName_Last">
      <value>ProjSponLast</value>
    </field>
    <field name="Project Description">
      <value>Project Desc</value>
    </field>
    <field name="Project Name">
      <value>ProjName</value>
    </field>
    <field name="ProposedMeasure_1">
      <value>Prop meas 1 </value>
    </field>
    <field name="ProposedMeasure_2">
      <value>prop meas 2</value>
    </field>
    <field name="ProposedMeasure_3">
      <value>prop meas 3 </value>
    </field>
    <field name="ProposedMeasure_4">
      <value>prop meas 4</value>
    </field>
    <field name="ProposedMeasure_5">
      <value>prop meas 5</value>
    </field>
    <field name="Savings_Yr">
      <value>2</value>
    </field>
    <field name="Savings_YrAmount">
      <value>12313</value>
    </field>
    <field name="Savings_YrAmountP">
      <value>234254</value>
    </field>
    <field name="Submit" />
    <field name="SubmitByName_First">
      <value>SubNameFirst</value>
    </field>
    <field name="SubmitByName_Last">
      <value>SubNameLast</value>
    </field>
    <field name="TempHeadCount">
      <value>2</value>
    </field>
    <field name="TempMonths">
      <value>4</value>
    </field>
  </fields>
  <ids original="FA098893AF1E1140A46335ED9EEEA5CE" modified="A3D8A6418EFE9E439F3CADCF3350958D" />
</xfdf>

EXCEL VBA RAW:

Sub AddFormData()    

    Dim FName As String, FD As FileDialog
    Dim WApp As Object, WDoc As Object, WDR As Object
    Dim ExR As Range    

    Set FD = Application.FileDialog(msoFileDialogOpen)
    FD.Filters.Add "PDF FORM DATA", "*.xfdf", 1
    FD.Show

    If FD.SelectedItems.Count <> 0 Then        
        For yi = 1 To FD.SelectedItems.Count
            FName = FD.SelectedItems(yi)

            '''''''''''''''''''''''''''''''''''''''''
            Set oXMLFile = CreateObject("MSXML2.DOMDocument") '''“Microsoft.XMLDOM”)
            XMLFileName = FName
            oXMLFile.Load (XMLFileName)

            Set List = oXMLFile.SelectNodes("//fields/")    

            ''''''''''''''''''''''''''''''''''''''''

            '''''''''''''''''''''''''''''''''''''
        Next yi    
    End If    
End Sub

Solution

  • Consider XPath, a more expressive parsing tool than walking down .Item, .ChildNodes, .Attributes, and other element collections. Also, you have a default namespace, xmlns="...", in your XML document. To resolve, assign a colon-separated prefix like pdf to default namespace to access all nodes in its scope. Below is a generalized version to walk down nested levels of tree:

    Set oXMLFile = CreateObject("MSXML2.DOMDocument") '''“Microsoft.XMLDOM”)
    XMLFileName = FName
    oXMLFile.Load XMLFileName
    
    ' TEMPORARILY DEFINE PREFIX FOR DEFAULT NAMESPACE
    oXMLFile.setProperty "SelectionNamespaces", "xmlns:pdf='http://ns.adobe.com/xfdf/'"
    
    ' PRINT ALL THIRD NESTED LEVEL NODE VALUES
    Set myList = oXMLFile.SelectNodes("/*/*/*")
    For Each var In myList
        Debug.Print var.Text
    Next var
    ' 2
    ' 3
    ' 1
    ' D:23440101
    ' D:13240101
    ' D:12340101
    
    ' PRINT ALL THIRD NESTED LEVEL ATTRIBUTES VALUES
    Dim var as Variant
    ...
    Set myList = oXMLFile.SelectNodes("/*/*/*/@*")
    For Each var In myList
        Debug.Print var.Text
    Next var
    ' 3 C_AQS_1
    ' 3 C_AQS_2
    ' 3 C_AQS_3
    ' 3 C_AYS_1
    ' 3 C_AYS_2
    ' 3 C_AYS_3
    

    Should you need to access indexed values, you can still use XPath for node or attribute value

    ' PRINT INDEXED NODE AND ATTRIBUTE
    Set myList = oXMLFile.SelectNodes("/*/*/*")         ' MOVE TO SECOND NESTED LEVEL
    Debug.Print myList(0).SelectSingleNode("*").Text
    Debug.Print myList(0).SelectSingleNode("@*").Text
    ' 2
    ' 3 C_AQS_1
    
    Debug.Print myList(1).SelectSingleNode("*").Text
    Debug.Print myList(1).SelectSingleNode("@*").Text
    ' 3
    ' 3 C_AQS_2
    
    Debug.Print myList(2).SelectSingleNode("*").Text
    Debug.Print myList(2).SelectSingleNode("@*").Text
    ' 1
    ' 3 C_AQS_3
    

    And in loop

    Dim var as Variant
    ...
    For Each var In myList
        If Not var.SelectSingleNode("*") Is Nothing Then
            Debug.Print var.SelectSingleNode("*").Text
            Debug.Print var.SelectSingleNode("@*").Text
        End If
    Next var