Search code examples
rxml

write xml from api response to xml file in R


I am using an API which returns some xml (in a text format) that I need to write to an xml file, for others to work with in another programming language (C# / .NET).

The content of the API response is a list with two entries, of which the second is as character called $xml, which (structually) looks like this:

[1] "\n\n<xx05results version="10.19.7.22" date="3.08.2023 15.05" company="yyy" model="Zone 1">\n<table head="Resultater: Zone 1">\n<table head="Climate: Norway, Oslo">\n<table head="Behov">\n\t\n\t\tMWhJanFebMarAprMajJunJulAugSepOktNovDecÅret\n\t\n\t\n\t\tVarme4792,153535,953040,671989,00235,910,000,000,00675,682113,743371,184365,0224119,30\n\t\n\t\n\t\tEl 2018-2917,82-2152,89-1851,26-1210,87-143,330,310,320,32-411,13-1286,82-2052,54-2657,72-14683,42\n\t\n\t\n\t\tEl lavenergi-2917,82-2152,89-1851,26-1210,87-143,330,310,320,32-411,13-1286,82-2052,54-2657,72-14683,42\n\t\n\t\n\t\tOvertemperatur i rum0,000,000,0014,4758,1772,5081,3763,2621,510,000,000,00311,28\n\t\n\n<table head="Samlet energibehov">\n\t\n\t\tMWhJanFebMarAprMajJunJulAugSepOktNovDecÅret\n\t\n\t\n\t\tEksisterende bygning-751,70-554,55-476,73-297,1721,7673,0981,9763,87-83,96-331,22-528,64-684,64-3467,92\n\t\n\t\n\t\tkWh/m²-3666,8-2705,1-2325,5-1449,6106,1356,5399,9311,6-409,6-1615,7-2578,7-3339,7-16916,7\n\t\n\t\n\t\t2018-751,70-554,55-476,73-297,1721,7673,0981,9763,87-83,96-331,22-528,64-684,64-3467,92\n\t\n\t\n\t\tkWh/m²-3666,8-2705,1-2325,5-1449,6106,1356,5399,9311,6-409,6-1615,7-2578,7-3339,7-16916,7\n\t\n\t\n\t\tLavenergi-751,70-554,55-476,73-297,1721,7673,0981,9763,87-83,96-331,22-528,64-684,64-3467,92\n\t\n\t\n\t\tkWh/m²-3666,8-2705,1-2325,5-1449,6106,1356,5399,9311,6-409,6-1615,7-2578,7-3339,7-16916,7\n\t\n\n<table head="Varme.">\n\t\ ...

The xml-file is supposed to have a root node called <be05results version="10.19.7.22" date="4.08.2023 09.32" company="SBi" model=""Zone 1"> with associated child nodes called <table head="Behov">, <table head="Samlet energibehov">, etc., each of which have childs called <row>. <row>'s contain <id>, <jan>, <feb>, etc.

I was hoping that it would be possible to create an xml file (for each api call) that returns an xml file in the correct format, without having to specify the entire structure (with names), manually.

I am new to XML, and do therefore not know exactly where to start.

Any help would be much appreciated!


Solution

  • Assuming the string is valid XML, save it to a file as-is and pass it on; you don't need to parse it nor worry about its structure just to save it.

    The example bellow uses readr for reading/writing and httr2 for API requests, a mock http server is provided by webfakes. xml2 is used only to get a path for a XML example and to check if the saved file can indeed be parsed.

    library(httr2)
    library(xml2)
    library(readr)
    
    # example file bundled with xml2:
    xml_example_str <- read_file(xml2_example("cd_catalog.xml"))
    # first few lines:
    substr(xml_example_str, 1, 300)
    #> [1] "<?xml version=\"1.0\" encoding=\"ISO8859-1\" ?>\r\n<CATALOG>\r\n  <CD>\r\n    <TITLE>Empire Burlesque</TITLE>\r\n    <ARTIST>Bob Dylan</ARTIST>\r\n    <COUNTRY>USA</COUNTRY>\r\n    <COMPANY>Columbia</COMPANY>\r\n    <PRICE>10.90</PRICE>\r\n    <YEAR>1985</YEAR>\r\n  </CD>\r\n  <CD>\r\n    <TITLE>Hide your heart</TITLE>\r\n    "
    
    # start example http server, 
    # send xml string in json, 
    # same content will be included in json response 
    resp <- request(example_url()) %>%
      req_url_path("/post") %>% 
      req_body_json(list(example_xml = xml_example_str)) %>% 
      req_perform()
    resp
    #> <httr2_response>
    #> POST http://127.0.0.1:57784/post
    #> Status: 200 OK
    #> Content-Type: application/json
    #> Body: In memory (12457 bytes)
    
    # extract and parse json from response 
    parsed_response <- resp %>% 
      resp_body_json()
    
    # check the start of delivered xml string
    substr(parsed_response$json$example_xml, 1, 300)
    #> [1] "<?xml version=\"1.0\" encoding=\"ISO8859-1\" ?>\r\n<CATALOG>\r\n  <CD>\r\n    <TITLE>Empire Burlesque</TITLE>\r\n    <ARTIST>Bob Dylan</ARTIST>\r\n    <COUNTRY>USA</COUNTRY>\r\n    <COMPANY>Columbia</COMPANY>\r\n    <PRICE>10.90</PRICE>\r\n    <YEAR>1985</YEAR>\r\n  </CD>\r\n  <CD>\r\n    <TITLE>Hide your heart</TITLE>\r\n    "
    
    # save
    readr::write_file(parsed_response$json$example_xml, "out.xml")
    fs::dir_info(glob = "*.xml")[1:3]
    #> # A tibble: 1 × 3
    #>   path       type         size
    #>   <fs::path> <fct> <fs::bytes>
    #> 1 out.xml    file        4.96K
    
    # read and parse save file
    read_xml("out.xml") |> print(max_n = 5)
    #> {xml_document}
    #> <CATALOG>
    #> [1] <CD>\n  <TITLE>Empire Burlesque</TITLE>\n  <ARTIST>Bob Dylan</ARTIST>\n   ...
    #> [2] <CD>\n  <TITLE>Hide your heart</TITLE>\n  <ARTIST>Bonnie Tylor</ARTIST>\n ...
    #> [3] <CD>\n  <TITLE>Greatest Hits</TITLE>\n  <ARTIST>Dolly Parton</ARTIST>\n   ...
    #> [4] <CD>\n  <TITLE>Still got the blues</TITLE>\n  <ARTIST>Gary More</ARTIST>\ ...
    #> [5] <CD>\n  <TITLE>Eros</TITLE>\n  <ARTIST>Eros Ramazzotti</ARTIST>\n  <COUNT ...
    #> ...
    

    Created on 2023-08-04 with reprex v2.0.2