I am working with XML data that was downloaded using the GET function from the httr package. The returned content is of type application/xml. An extract is shown below:
<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://example.com/schema/dxf/2.0">
<pager>
<page>1</page>
<pageCount>17</pageCount>
<total>819</total>
<pageSize>50</pageSize>
<nextPage>https://xxx.org.ng/xxx/api/indicators?page=2&amp;format=xml</nextPage>
</pager>
<indicators>
<indicator id="cfvVUWwkwje">
<displayName> ART Total </displayName>
</indicator>
<indicator id="gytvOB3J7">
<displayName> ART Microscopy - Total</displayName>
</indicator>
<indicator id="5fgtZdtvQRW">
<displayName> ART Microscopy Biology - Total</displayName>
</indicator>
<indicator id="g6hYenEHnsu">
<displayName> ART GeneXpert - Total </displayName>
</indicator>
<indicator id="hhjxxDlG87j">
<displayName> ART Functional -Total</displayName>
</indicator>
<indicator id="SarCtUBpBru">
<displayName> ART 21 - Total</displayName>
</indicator>
<indicator id="ftywhPKoMgp">
<displayName> Buruli Ulcer Total</displayName>
</indicator>
<indicator id="gyyhtAzCQZ0">
<displayName> xART 21 prophylaxis Functional -Total</displayName>
</indicator>
<indicator id="vftWafaROyq0">
<displayName> xART 21 Non Functional - Total</displayName>
</indicator>
</indicators>
</metadata>
I used the following code to download the XML and tried to convert it to a data frame:
url_xml <- modify_url(url1, path = path)
xml_response <- GET(url_xml, authenticate(username, password))
http_type(xml_response)
resp_content <- content(xml_response)
parsed_content <- xmlParse(resp_content)
# get the root
parsed_xml_root <- xmlRoot(parsed_content)
# parse out names and IDs
df_xml <- xmlToDataFrame(nodes = getNodeSet(parsed_xml_root,"//indicators/indicator/displayName"))
id <- xmlSApply(parsed_xml_root[["indicator"]], xmlGetAttr, "id")
all_values_df <- cbind(df_xml, id)
I want to get the id of each indicator and its display name, but the resulting data frame was empty. Any suggestions?
I got it solved by reading the children of the indicators node directly instead of using an XPath query:
df_xml <- xmlToDataFrame(nodes = xmlChildren(xmlRoot(parsed_content)[["indicators"]]))
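For anyone who also needs the id column: as far as I can tell, xmlToDataFrame only picks up child element values, so the id attribute is not included in the result above. The original XPath query most likely returned nothing because the document declares a default namespace (the xmlns attribute on metadata), and XPath then needs a prefix bound to that namespace. Below is a minimal sketch of one way to collect both fields with the XML package. The XML is a trimmed copy of the extract inlined as a string so it runs without the API; the prefix d and the variable names are my own choices, not anything the API requires.

```r
library(XML)

# Trimmed copy of the extract above, inlined so the sketch runs offline.
# With httr, the text could be obtained with something like
#   content(xml_response, as = "text", encoding = "UTF-8")
# (assumption: the server really returns UTF-8).
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://example.com/schema/dxf/2.0">
<indicators>
<indicator id="cfvVUWwkwje">
<displayName> ART Total </displayName>
</indicator>
<indicator id="gytvOB3J7">
<displayName> ART Microscopy - Total</displayName>
</indicator>
</indicators>
</metadata>'

parsed_content <- xmlParse(xml_text, asText = TRUE)

# Bind an arbitrary prefix ("d") to the document's default namespace
# so that the XPath expression can match the elements.
ns <- c(d = "http://example.com/schema/dxf/2.0")
indicator_nodes <- getNodeSet(parsed_content,
                              "//d:indicators/d:indicator",
                              namespaces = ns)

# The id is an attribute; the display name is the text of a child element.
ids <- sapply(indicator_nodes, xmlGetAttr, "id")
display_names <- sapply(indicator_nodes, function(node) {
  trimws(xmlValue(node[["displayName"]]))  # drop the padding spaces
})

all_values_df <- data.frame(id = ids,
                            displayName = display_names,
                            stringsAsFactors = FALSE)
print(all_values_df)
```

The same pattern should work on the full response; only xml_text changes. Since the pager reports 17 pages, each page would need to be fetched and the resulting data frames combined with rbind.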