My question might be fairly simple, but I'm having trouble working with XML. I have a list of metabolites and a database where I can find information about them in XML format. I'm trying to create a table of synonyms so I can translate the metabolite names I have into ones better suited to the downstream analysis. Here is a simple piece of code where I try to access the synonyms node, and for some reason it is not working. I tried another XML file with success. Also, any tips on how to build this table would be appreciated.
library(xml2)

metabolites <- read_xml('<?xml version="1.0" encoding="UTF-8"?>
<hmdb xmlns="">
<creation_date>2005-11-16 15:48:42 UTC</creation_date>
<update_date>2019-01-11 19:13:56 UTC</update_date>
<cs_description>1-Methylhistidine, also known as 1-mhis...</cs_description>
<description>One-methylhistidine (1-MHis) is derived ...</description>
<synonyms>
<synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)propanoic acid</synonym>
<synonym>1 Methylhistidine</synonym>
</synonyms>
</hmdb>')

syn <- xml_find_all(metabolites, "//synonyms")
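It has to do with the namespace declaration. The root element declares a default namespace, so every child element belongs to it, and an unprefixed XPath name such as //synonyms matches nothing. xml2 exposes that default namespace under the prefix d1, which you can check with xml_ns() and then use in the query: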
library(xml2)

metabolites <- read_xml('<hmdb xmlns="">
<creation_date>2005-11-16 15:48:42 UTC</creation_date>
<update_date>2019-01-11 19:13:56 UTC</update_date>
<cs_description>1-Methylhistidine, also known as 1-mhis...</cs_description>
<description>One-methylhistidine (1-MHis) is derived ...</description>
<synonyms>
<synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)propanoic acid</synonym>
<synonym>1 Methylhistidine</synonym>
</synonyms>
</hmdb>')

# the default namespace is exposed under the prefix d1
xml_ns(metabolites)
#> d1 <->

# doesn't work: the unprefixed name is not looked up in the default namespace
xml_find_all(metabolites, "//synonyms")
#> {xml_nodeset (0)}

# works: the d1 prefix points the query at the default namespace
xml_find_all(metabolites, "//d1:synonyms")
#> {xml_nodeset (1)}
#> [1] <synonyms>\n <synonym>(2S)-2-amino-3-(1-Methyl-1H-imidazol-4-yl)pro ...
Created on 2019-11-09 by the reprex package (v0.3.0)
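As for building the synonym table, here is a minimal sketch of one way to do it. It assumes each entry sits in a metabolite element with a name child next to synonyms. Those two element names, the example.org namespace URI, the second toy entry and the synonym_table variable are all illustrative assumptions, so adjust them to whatever your real file contains.

```r
library(xml2)

# Toy document: the namespace URI and the <metabolite>/<name> elements are
# assumptions for illustration, not taken from the real file.
doc <- read_xml('<hmdb xmlns="http://example.org/ns">
  <metabolite>
    <name>1-Methylhistidine</name>
    <synonyms>
      <synonym>1 Methylhistidine</synonym>
      <synonym>1-MHis</synonym>
    </synonyms>
  </metabolite>
  <metabolite>
    <name>Example metabolite</name>
    <synonyms>
      <synonym>Example synonym</synonym>
    </synonyms>
  </metabolite>
</hmdb>')

# The default namespace is exposed under the prefix d1
ns <- xml_ns(doc)

entries <- xml_find_all(doc, "//d1:metabolite", ns)

# One data frame per metabolite, with one row per synonym
rows <- lapply(entries, function(entry) {
  syns <- xml_text(xml_find_all(entry, "./d1:synonyms/d1:synonym", ns))
  data.frame(
    name = rep(xml_text(xml_find_first(entry, "./d1:name", ns)), length(syns)),
    synonym = syns,
    stringsAsFactors = FALSE
  )
})

# Stack the per-metabolite pieces into a single lookup table
synonym_table <- do.call(rbind, rows)

# Translate a synonym to the name stored alongside it
synonym_table$name[match("1-MHis", synonym_table$synonym)]
```

If you would rather not carry the d1: prefix around, calling xml_ns_strip() on the document first removes the default namespace, after which plain paths such as //synonyms should match.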