I have a problem extracting attributes from XML in R. My XML file looks like this:
- <export>
+ <ExportRef>
- <BookNodes>
- <Book label="romance">
+ <Showing>
- <Data>
+ <Char1 label="Char1">
- <Char2 label="Char2">
+ <SubChar21>
- <SubChar22>
<Range unit="nm">4</Range>
<Range unit="nm">8</Range>
</SubChar22>
- <Char3 label="Char3">
+ <SubChar31>
- <SubChar32>
<Range Id="1">voc</Range>
<Range Id="2">buc</Range>
</SubChar32>
</Data>
</Book>
- <Book label="horror">
+ <Showing>
- <Data>
+ <Char1 label="Char1">
- <Char2 label="Char2">
+ <SubChar21>
- <SubChar22>
<Range unit="nm">4</Range>
<Range unit="nm">8</Range>
</SubChar22>
- <Char3 label="Char3">
+ <SubChar31>
- <SubChar32>
<Range Id="1">voc</Range>
<Range Id="2">buc</Range>
</SubChar32>
</Data>
</Book>
</BookNodes>
</export>
I would like a list of only the Range Id values for each book category. For example:
romance:
id id
1 2
horror:
id id
1 2
When I do something like this:
RangeID_1<-xpathSApply(AC_Node[[1]][[2]], ".//Range", xmlAttrs)
I get:
unit unit id id
"nm" "nm" "1" "2"
How can I tell R that I only want the Range Id and not the Range unit?
Thank you very much!!
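One option that stays with the XML package from the question is to put the filter into the XPath itself: `.//Range[@Id]` matches only the Range elements that carry an Id attribute, and `xmlGetAttr` pulls out just that attribute. This is only a minimal sketch. The inline XML is a small, well-formed stand-in I made up for the real file, and `book` and `range_ids` are illustrative names, not objects from the question.

```r
library(XML)

# Hypothetical, well-formed stand-in for one Book from the file in the question.
xml_text <- '<export><BookNodes>
  <Book label="romance"><Data>
    <Char2 label="Char2"><SubChar22>
      <Range unit="nm">4</Range><Range unit="nm">8</Range>
    </SubChar22></Char2>
    <Char3 label="Char3"><SubChar32>
      <Range Id="1">voc</Range><Range Id="2">buc</Range>
    </SubChar32></Char3>
  </Data></Book>
</BookNodes></export>'

doc <- xmlParse(xml_text, asText = TRUE)

# Take the first Book node as the context node for the relative XPath.
book <- getNodeSet(doc, "//Book")[[1]]

# [@Id] keeps only Range elements that have an Id attribute;
# xmlGetAttr then returns the value of that attribute for each match.
range_ids <- xpathSApply(book, ".//Range[@Id]", xmlGetAttr, "Id")
print(range_ids)
```

The same predicate should work with whatever node you already pass in (for example `AC_Node[[1]][[2]]`), assuming it is an internal node from `xmlParse`.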
My two cents with rvest:
library(rvest)
read_xml("your_xml_file.xml") %>%
xml_nodes("Range") %>%
xml_attr("Id")
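Note that the pipeline above asks every Range node for its Id, so the Range elements that only have a unit attribute should come back as NA, and the result is not split by book. A possible variation, sketched here with the xml2 package (which, as far as I know, is where `read_xml` and `xml_attr` live), selects only `Range[@Id]` and groups the values by the Book label. The inline XML is again a made-up, well-formed stand-in for the real file, and `ids_by_book` is just an illustrative name.

```r
library(xml2)

# Hypothetical, well-formed stand-in for the file in the question.
xml_text <- '<export><BookNodes>
  <Book label="romance"><Data>
    <Char2 label="Char2"><SubChar22>
      <Range unit="nm">4</Range><Range unit="nm">8</Range>
    </SubChar22></Char2>
    <Char3 label="Char3"><SubChar32>
      <Range Id="1">voc</Range><Range Id="2">buc</Range>
    </SubChar32></Char3>
  </Data></Book>
  <Book label="horror"><Data>
    <Char2 label="Char2"><SubChar22>
      <Range unit="nm">4</Range><Range unit="nm">8</Range>
    </SubChar22></Char2>
    <Char3 label="Char3"><SubChar32>
      <Range Id="1">voc</Range><Range Id="2">buc</Range>
    </SubChar32></Char3>
  </Data></Book>
</BookNodes></export>'

doc <- read_xml(xml_text)
books <- xml_find_all(doc, ".//Book")

# For each Book, keep only the Range nodes that have an Id attribute
# and collect the values of that attribute.
ids_by_book <- lapply(books, function(b) {
  xml_attr(xml_find_all(b, ".//Range[@Id]"), "Id")
})

# Name each list element after the Book's label attribute.
names(ids_by_book) <- xml_attr(books, "label")
print(ids_by_book)
```

For a file on disk, `read_xml("your_xml_file.xml")` would replace the inline string.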