How to refer to an XML node with specific id


I am trying to parse the XML returned by an API, as shown below:

library(httr2)
library(xml2)
library(tidyverse)

resp_xml <- request("https://data-api.ecb.europa.eu/service/data/CBD2/A..W0.11._Z._Z.A.A.A0000._X.ALL.CA._Z.LE._T.EUR") %>%
  req_perform() %>%
  resp_body_xml()

xml2::xml_find_all(resp_xml, "//generic:SeriesKey") %>% 
  xml2::xml_find_all("//generic:Value")

How can I extract only the values for id="REF_AREA"? I want a list of just the country codes, such as "AT", "BE", etc. I tried the following, but I only get NAs:

xml2::xml_find_all(resp_xml, "//generic:SeriesKey") %>% 
  xml2::xml_find_all("//generic:Value") %>% 
  xml2::xml_attr("REF_AREA")

Solution

  • To get the value attribute of the nodes where id="REF_AREA", filter on the id attribute in the XPath expression:

    xml_find_all(resp_xml, "//generic:SeriesKey") %>% 
      xml_find_all("//generic:Value[@id='REF_AREA']") %>% 
      xml_attr("value")
    
    #>  [1] "AT" "B0" "BE" "BG" "CY" "CZ" "DE" "DK" "EE" "ES" "FI" "FR" "GB" "GR" "HR"
    #> [16] "HU" "IE" "IT" "LT" "LU" "LV" "MT" "NL" "PL" "PT" "RO" "SE" "SI" "SK" "U2"
    
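
    To see why the original attempt returned only NAs, here is a minimal sketch on a small inline document. The document and its namespace URI are made up for illustration and are not the real ECB response. xml_attr() looks up an attribute by name, and no element has an attribute named REF_AREA; "REF_AREA" is the content of the id attribute.

    ```r
    library(xml2)

    # Hypothetical miniature of the response; the namespace URI is a placeholder.
    doc <- read_xml('<root xmlns:generic="http://example.com/generic">
      <generic:SeriesKey>
        <generic:Value id="FREQ" value="A"/>
        <generic:Value id="REF_AREA" value="AT"/>
      </generic:SeriesKey>
      <generic:SeriesKey>
        <generic:Value id="FREQ" value="A"/>
        <generic:Value id="REF_AREA" value="BE"/>
      </generic:SeriesKey>
    </root>')

    values <- xml_find_all(doc, "//generic:SeriesKey/generic:Value")

    # No <generic:Value> has an attribute *named* REF_AREA, so every entry is NA.
    missing_attr <- xml_attr(values, "REF_AREA")

    # Filter on the id attribute in XPath, then read the value attribute.
    areas <- xml_attr(
      xml_find_all(doc, "//generic:SeriesKey/generic:Value[@id='REF_AREA']"),
      "value"
    )
    ```

    Here missing_attr holds four NA values, while areas holds "AT" and "BE". The sketch also folds the two xml_find_all() steps into a single XPath expression, which is one possible way to write the query.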

    In other words, you select the nodes whose id attribute equals "REF_AREA" and then read their value attribute. The code below does the same filtering in R instead of in XPath, which may make the idea clearer:

    xml_find_all(resp_xml, "//generic:SeriesKey") %>% 
      xml_find_all("//generic:Value") %>% 
      .[xml_attr(., "id") == "REF_AREA"]
    
    #> {xml_nodeset (30)}
    #>  [1] <generic:Value id="REF_AREA" value="AT"/>
    #>  [2] <generic:Value id="REF_AREA" value="B0"/>
    #>  [3] <generic:Value id="REF_AREA" value="BE"/>
    #>  [4] <generic:Value id="REF_AREA" value="BG"/>
    #>  [5] <generic:Value id="REF_AREA" value="CY"/>
    #>  [6] <generic:Value id="REF_AREA" value="CZ"/>
    #>  [7] <generic:Value id="REF_AREA" value="DE"/>
    #>  [8] <generic:Value id="REF_AREA" value="DK"/>
    #>  [9] <generic:Value id="REF_AREA" value="EE"/>
    #> [10] <generic:Value id="REF_AREA" value="ES"/>
    #> [11] <generic:Value id="REF_AREA" value="FI"/>
    #> [12] <generic:Value id="REF_AREA" value="FR"/>
    #> [13] <generic:Value id="REF_AREA" value="GB"/>
    #> [14] <generic:Value id="REF_AREA" value="GR"/>
    #> [15] <generic:Value id="REF_AREA" value="HR"/>
    #> [16] <generic:Value id="REF_AREA" value="HU"/>
    #> [17] <generic:Value id="REF_AREA" value="IE"/>
    #> [18] <generic:Value id="REF_AREA" value="IT"/>
    #> [19] <generic:Value id="REF_AREA" value="LT"/>
    #> [20] <generic:Value id="REF_AREA" value="LU"/>
    #> ...
    

    Created on 2023-12-06 with reprex v2.0.2
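
One caveat about the two-step pipelines above: an XPath expression that starts with "//" searches from the document root, even when it is applied to a node set, so the preceding SeriesKey step does not actually restrict the search. It makes no visible difference here, presumably because the REF_AREA values only occur inside SeriesKey elements, but a leading ".//" keeps the search below the selected nodes. A minimal sketch on a made-up document (the structure and namespace URI are assumptions for illustration):

```r
library(xml2)

# Hypothetical document: one <generic:Value> sits outside the SeriesKey.
doc <- read_xml('<root xmlns:generic="http://example.com/generic">
  <generic:SeriesKey>
    <generic:Value id="FREQ" value="A"/>
    <generic:Value id="REF_AREA" value="AT"/>
  </generic:SeriesKey>
  <generic:Attributes>
    <generic:Value id="TITLE" value="Example"/>
  </generic:Attributes>
</root>')

keys <- xml_find_all(doc, "//generic:SeriesKey")

# "//" restarts at the document root, so the SeriesKey step is ignored.
from_root <- xml_find_all(keys, "//generic:Value")

# ".//" searches only below each SeriesKey node.
from_keys <- xml_find_all(keys, ".//generic:Value")

length(from_root)  # 3: includes the Value under generic:Attributes
length(from_keys)  # 2: only the Values inside the SeriesKey
```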