r xml httr

Empty data frame returned from XML data using xmlToDataFrame in R


I am working with XML data that was downloaded using the GET function from the httr package. The content returned is of type application/xml. An extract is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://example.com/schema/dxf/2.0">
  <pager>
    <page>1</page>
    <pageCount>17</pageCount>
    <total>819</total>
    <pageSize>50</pageSize>
    <nextPage>https://xxx.org.ng/xxx/api/indicators?page=2&amp;format=xml</nextPage>
  </pager>
  <indicators>
    <indicator id="cfvVUWwkwje">
      <displayName> ART Total </displayName>
    </indicator>
    <indicator id="gytvOB3J7">
      <displayName> ART Microscopy - Total</displayName>
    </indicator>
    <indicator id="5fgtZdtvQRW">
      <displayName> ART Microscopy Biology - Total</displayName>
    </indicator>
    <indicator id="g6hYenEHnsu">
      <displayName> ART GeneXpert - Total </displayName>
    </indicator>
    <indicator id="hhjxxDlG87j">
      <displayName> ART Functional -Total</displayName>
    </indicator>
    <indicator id="SarCtUBpBru">
      <displayName> ART 21 - Total</displayName>
    </indicator>
    <indicator id="ftywhPKoMgp">
      <displayName> Buruli Ulcer Total</displayName>
    </indicator>
    <indicator id="gyyhtAzCQZ0">
      <displayName> xART 21 prophylaxis Functional -Total</displayName>
    </indicator>
    <indicator id="vftWafaROyq0">
      <displayName> xART 21 Non Functional - Total</displayName>
    </indicator>
  </indicators>
</metadata>

I used the following code to download the XML and tried to convert it to a data frame:

library(httr)
library(XML)

url_xml <- modify_url(url1, path = path)
xml_response <- GET(url_xml, authenticate(username, password))

http_type(xml_response)
resp_content <- content(xml_response)

parsed_content <- xmlParse(resp_content)

# get the root
parsed_xml_root <- xmlRoot(parsed_content)
# parse out names and IDs
df_xml <- xmlToDataFrame(nodes = getNodeSet(parsed_xml_root,"//indicators/indicator/displayName"))
id <- xmlSApply(parsed_xml_root[["indicator"]], xmlGetAttr, "id")
all_values_df <- cbind(df_xml, id)

I want to get the id of each indicator together with its display name, but the resulting data frame is empty. Any suggestions?


Solution

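  • I solved it by passing the children of the indicators node directly to xmlToDataFrame instead of selecting nodes with an XPath query: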

    df_xml <- xmlToDataFrame(nodes = xmlChildren(xmlRoot(parsed_content)[["indicators"]]))
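
    A likely reason the original XPath query matched nothing is that the document declares a default namespace (the xmlns attribute on metadata), and unprefixed names in an XPath expression do not match namespaced elements. The one-liner above also returns only the displayName column, not the id attribute. Below is a minimal sketch of one way to get both, assuming the XML package and a shortened inline copy of the extract from the question. The prefix d, the variable names and the two-row sample are illustrative choices, not part of the original code.

```r
library(XML)

# Shortened copy of the extract from the question (assumption: same structure
# as the real response, with only two indicators kept for brevity).
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://example.com/schema/dxf/2.0">
  <indicators>
    <indicator id="cfvVUWwkwje">
      <displayName> ART Total </displayName>
    </indicator>
    <indicator id="gytvOB3J7">
      <displayName> ART Microscopy - Total</displayName>
    </indicator>
  </indicators>
</metadata>'

doc <- xmlParse(xml_text, asText = TRUE)

# Bind an arbitrary prefix ("d") to the document's default namespace so that
# the XPath expression can address the namespaced elements.
ns <- c(d = "http://example.com/schema/dxf/2.0")
nodes <- getNodeSet(doc, "//d:indicators/d:indicator", namespaces = ns)

# Build one row per indicator: the id attribute plus the trimmed display name.
indicators_df <- data.frame(
  id = sapply(nodes, xmlGetAttr, "id"),
  displayName = trimws(sapply(nodes, function(n) xmlValue(n[["displayName"]]))),
  stringsAsFactors = FALSE
)

print(indicators_df)
```

    With the real response, the same approach should work by parsing the response body as text instead of xml_text, for example xmlParse(content(xml_response, as = "text"), asText = TRUE).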