
How to Import All Element Nodes in Google Sheets from XML


Is there a way to import a full XML document, including the element tags, into a Google Sheets worksheet?

I'm looking to import all elements, including their opening and closing tags, from an online XML page into Google Sheets.

The XML page looks like this:

<staff>
    <admin>
        <first-name>Patrice</first-name>
        <family-name>Withers</family-name>
        <year-started>2006</year-started>
        <starting-salary>30,500</starting-salary>
        <current-salary>34,000</current-salary>
    </admin>
    <admin>
        <first-name>Shelly</first-name>
        <family-name>Lancer</family-name>
        <year-started>2015</year-started>
        <starting-salary>32,500</starting-salary>
        <current-salary>33,500</current-salary>
    </admin>
</staff>
<students>
    <full-time>
        <first-name>Henry</first-name>
        <family-name>Nunes</family-name>
        <status>Freshman</status>
        <current-tuition>2,400</current-tuition>
        <last-year-tuition>2,200</last-year-tuition>
    </full-time>
    <part-time>
        <first-name>Leslie</first-name>
        <family-name>Franks</family-name>
        <status>Senior</status>
        <current-tuition>2,300</current-tuition>
        <last-year-tuition>2,100</last-year-tuition>
    </part-time>
</students>

I'm trying to have Google Sheets download the full XML page, including the element headers, and display it, preferably as above in indented columns, or in a single column to look like this:

<staff>
<admin>
<first-name>Patrice</first-name>
<family-name>Withers</family-name>
<year-started>2006</year-started>
<starting-salary>30,500</starting-salary>
<current-salary>34,000</current-salary>
</admin>
<admin>
<first-name>Shelly</first-name>
<family-name>Lancer</family-name>
<year-started>2015</year-started>
<starting-salary>32,500</starting-salary>
<current-salary>33,500</current-salary>
</admin>
</staff>
<students>
<full-time>
<first-name>Henry</first-name>
<family-name>Nunes</family-name>
<status>Freshman</status>
<current-tuition>2,400</current-tuition>
<last-year-tuition>2,200</last-year-tuition>
</full-time>
<part-time>
<first-name>Leslie</first-name>
<family-name>Franks</family-name>
<status>Senior</status>
<current-tuition>2,300</current-tuition>
<last-year-tuition>2,100</last-year-tuition>
</part-time>
</students>

IMPORTDATA, which was suggested to another user with a similar question, failed because some of the XML values contain commas (as in the sample above), and IMPORTDATA splits each line into separate cells at every comma.


Solution

  • if your URL can be scraped (that is, it is publicly reachable and not behind authentication credentials), you could try:

    =ARRAYFORMULA(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
     ARRAY_CONSTRAIN(IMPORTDATA("url_here"), 5000, 20)),,999^99))))
    

    also you could perhaps try:

    =IMPORTXML("url_here", "//*")
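
If neither formula gives the layout you want, a custom function written in Apps Script may be another option. The following is only a minimal sketch, under these assumptions: the URL is publicly reachable, the XML is already served with one tag per line (as in the sample in the question), and the names `xmlToRows` and `IMPORTXMLRAW` are hypothetical ones chosen for illustration. It fetches the page as plain text and returns one row per line, so commas inside values are never treated as separators.

```javascript
/**
 * Turn raw XML text into a one-column 2D array, one row per non-empty line.
 * Pure helper (hypothetical name): it uses no Sheets services.
 * keepIndent = true keeps leading whitespace; otherwise each line is trimmed.
 */
function xmlToRows(xmlText, keepIndent) {
  var rows = [];
  xmlText.split(/\r?\n/).forEach(function (line) {
    if (line.trim() === "") return; // skip blank lines
    // Each row is a one-element array, i.e. a single cell in column A.
    rows.push([keepIndent ? line.replace(/\s+$/, "") : line.trim()]);
  });
  return rows;
}

/**
 * Custom function (hypothetical name) meant to be called from a cell.
 * Assumes the URL needs no login; UrlFetchApp is the Apps Script fetch service.
 */
function IMPORTXMLRAW(url, keepIndent) {
  var text = UrlFetchApp.fetch(url).getContentText();
  return xmlToRows(text, keepIndent === true);
}
```

After saving this under Extensions > Apps Script, `=IMPORTXMLRAW("url_here")` in a cell should spill the page into a single column, and `=IMPORTXMLRAW("url_here", TRUE)` keeps the original indentation as leading spaces. If the server returns the whole document on one line, this sketch will produce a single row, since it does not parse or pretty-print the XML.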