Search code examples
powerbipowerquerypowerbi-desktopdata-cleaningm

Ignoring error in Table.ExpandTableColumn or auto-replace with null


I am loading 100s of XML files from a folder and Power Query has created a function from the first file I had loaded. The problem is not all the XML files have a similar structure. Some have nodes that don't exist in other XML files. I can't change the source system which generates this.

1 XML has TAN and TANREG tags missing.

Company1:

<ENVELOPE>
 <HEADER>
  <VERSION>1</VERSION>
  <STATUS>1</STATUS>
 </HEADER>
 <BODY>
  <DESC>
   <CMPINFO>
    <COMPANY>0</COMPANY>
    <GROUP>0</GROUP>
    <LEDGER>0</LEDGER>
    <COSTCATEGORY>0</COSTCATEGORY>
    <COSTCENTRE>0</COSTCENTRE>
    <GODOWN>0</GODOWN>
    <STOCKGROUP>0</STOCKGROUP>
    <STOCKCATEGORY>0</STOCKCATEGORY>
    <STOCKITEM>0</STOCKITEM>
    <VOUCHERTYPE>0</VOUCHERTYPE>
    <CURRENCY>0</CURRENCY>
    <UNIT>0</UNIT>
    <BUDGET>0</BUDGET>
    <CLIENTRULE>0</CLIENTRULE>
    <SERVERRULE>0</SERVERRULE>
    <STATE>0</STATE>
    <TDSRATE>0</TDSRATE>
    <TAXCLASSIFICATION>0</TAXCLASSIFICATION>
    <STCATEGORY>0</STCATEGORY>
    <DEDUCTEETYPE>0</DEDUCTEETYPE>
    <ATTENDANCETYPE>0</ATTENDANCETYPE>
    <FBTCATEGORY>0</FBTCATEGORY>
    <FBTASSESSEETYPE>0</FBTASSESSEETYPE>
    <TARIFFCLASSIFICATION>0</TARIFFCLASSIFICATION>
    <EXCISEDUTYCLASSIFICATION>0</EXCISEDUTYCLASSIFICATION>
    <SERIALNUMBER>0</SERIALNUMBER>
    <ADJUSTMENTCLASSIFICATION>0</ADJUSTMENTCLASSIFICATION>
    <INCOMETAXSLAB>0</INCOMETAXSLAB>
    <INCOMETAXCLASSIFICATION>0</INCOMETAXCLASSIFICATION>
    <LBTCLASSIFICATION>0</LBTCLASSIFICATION>
    <TAXUNIT>0</TAXUNIT>
    <RETURNMASTER>0</RETURNMASTER>
    <GSTCLASSIFICATION>0</GSTCLASSIFICATION>
    <VOUCHERNUMBERSERIES>0</VOUCHERNUMBERSERIES>
    <VOUCHER>0</VOUCHER>
   </CMPINFO>
  </DESC>
  <DATA>
   <COLLECTION>
    <COMPANY NAME="Company 2" RESERVEDNAME="">
     <PINCODE TYPE="String">100001</PINCODE>
     <INCOMETAXNUMBER TYPE="String">1233456</INCOMETAXNUMBER>
     <COUNTRYNAME TYPE="String">India</COUNTRYNAME>
     <CORPORATEIDENTITYNO TYPE="String">yyyyyyy</CORPORATEIDENTITYNO>
     <STATENAME TYPE="String">yyyyyyy</STATENAME>
     <_NAME TYPE="String">Company 2</_NAME>
    </COMPANY>
   </COLLECTION>
  </DATA>
 </BODY>
</ENVELOPE>

Company2:

<ENVELOPE>
 <HEADER>
  <VERSION>1</VERSION>
  <STATUS>1</STATUS>
 </HEADER>
 <BODY>
  <DESC>
   <CMPINFO>
    <COMPANY>0</COMPANY>
    <GROUP>0</GROUP>
    <LEDGER>0</LEDGER>
    <COSTCATEGORY>0</COSTCATEGORY>
    <COSTCENTRE>0</COSTCENTRE>
    <GODOWN>0</GODOWN>
    <STOCKGROUP>0</STOCKGROUP>
    <STOCKCATEGORY>0</STOCKCATEGORY>
    <STOCKITEM>0</STOCKITEM>
    <VOUCHERTYPE>0</VOUCHERTYPE>
    <CURRENCY>0</CURRENCY>
    <UNIT>0</UNIT>
    <BUDGET>0</BUDGET>
    <CLIENTRULE>0</CLIENTRULE>
    <SERVERRULE>0</SERVERRULE>
    <STATE>0</STATE>
    <TDSRATE>0</TDSRATE>
    <TAXCLASSIFICATION>0</TAXCLASSIFICATION>
    <STCATEGORY>0</STCATEGORY>
    <DEDUCTEETYPE>0</DEDUCTEETYPE>
    <ATTENDANCETYPE>0</ATTENDANCETYPE>
    <FBTCATEGORY>0</FBTCATEGORY>
    <FBTASSESSEETYPE>0</FBTASSESSEETYPE>
    <TARIFFCLASSIFICATION>0</TARIFFCLASSIFICATION>
    <EXCISEDUTYCLASSIFICATION>0</EXCISEDUTYCLASSIFICATION>
    <SERIALNUMBER>0</SERIALNUMBER>
    <ADJUSTMENTCLASSIFICATION>0</ADJUSTMENTCLASSIFICATION>
    <INCOMETAXSLAB>0</INCOMETAXSLAB>
    <INCOMETAXCLASSIFICATION>0</INCOMETAXCLASSIFICATION>
    <LBTCLASSIFICATION>0</LBTCLASSIFICATION>
    <TAXUNIT>0</TAXUNIT>
    <RETURNMASTER>0</RETURNMASTER>
    <GSTCLASSIFICATION>0</GSTCLASSIFICATION>
    <VOUCHERNUMBERSERIES>0</VOUCHERNUMBERSERIES>
    <VOUCHER>0</VOUCHER>
   </CMPINFO>
  </DESC>
  <DATA>
   <COLLECTION>
    <COMPANY NAME="Company 1" RESERVEDNAME="">
     <PINCODE TYPE="String">100001</PINCODE>
     <INCOMETAXNUMBER TYPE="String">ABCDXYZ</INCOMETAXNUMBER>
     <COUNTRYNAME TYPE="String">India</COUNTRYNAME>
     <CORPORATEIDENTITYNO TYPE="String">xxxxxxx</CORPORATEIDENTITYNO>
     <STATENAME TYPE="String">xxxxx</STATENAME>
     <_NAME TYPE="String">Company 1</_NAME>
     <TAN TYPE="String">27ABCDE1234F1ZC</TAN>
     <TANREG TYPE="String">REGULAR</TANREG>
    </COMPANY>
   </COLLECTION>
  </DATA>
 </BODY>
</ENVELOPE>

I have tried to use try catch with Table.ExpandTableColumn, but it is becoming cumbersome to maintain for so many columns.

Is there any other way we can extract the data but as PQ to insert null automatically?

Edit: I have made a sample file with basic load where I am facing the issue. As you will see, Company 1 gives a problem as some fields are missing from its XML that is available in Company 2.

Link to sample Excel with Power Query


Solution

  • Try this for a simple fix.

    enter image description here

    let
        Source = Folder.Files("C:\Users\Dav\Downloads"),
        #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xml")),
        #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each let   Source = Xml.Tables([Content]),
        Table1 = Source{1}[Table],
        Table2 = Table1{1}[Table],
        Table0 = Table2{0}[Table],
        Table3 = Table0{0}[Table],
        #"Expanded PINCODE" = Table.ExpandTableColumn(Table3, "PINCODE", {"Element:Text"}, {"PINCODE"}),
        #"Expanded INCOMETAXNUMBER" = Table.ExpandTableColumn(#"Expanded PINCODE", "INCOMETAXNUMBER", {"Element:Text"}, {"INCOMETAXNUMBER"}),
        #"Expanded COUNTRYNAME" = Table.ExpandTableColumn(#"Expanded INCOMETAXNUMBER", "COUNTRYNAME", {"Element:Text"}, {"COUNTRYNAME"}),
        #"Expanded CORPORATEIDENTITYNO" = Table.ExpandTableColumn(#"Expanded COUNTRYNAME", "CORPORATEIDENTITYNO", {"Element:Text"}, {"CORPORATEIDENTITYNO"}),
        #"Expanded STATENAME" = Table.ExpandTableColumn(#"Expanded CORPORATEIDENTITYNO", "STATENAME", {"Element:Text"}, {"STATENAME"}),
        #"Expanded _NAME" = Table.ExpandTableColumn(#"Expanded STATENAME", "_NAME", {"Element:Text"}, {"NAME"}),
        #"Expanded TAN" = try Table.ExpandTableColumn(#"Expanded _NAME", "TAN", {"Element:Text"}, {"TAN"}) otherwise #"Expanded _NAME",
        #"Expanded TANREG" = try  Table.ExpandTableColumn(#"Expanded TAN", "TANREG", {"Element:Text"}, {"TANREG"}) otherwise #"Expanded _NAME"
    in
        #"Expanded TANREG"),
        #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Name", "Custom"})
    in
        #"Removed Other Columns"