
Error in the code while trying to read from URL using XML, xmlParse


I have to read the XML file at http://www.ggobi.org/book/data/australian-crabs.xml and answer the question that follows this code:

library(XML)
crabs <- xmlParse('http://www.ggobi.org/book/data/australian-crabs.xml')
root <- xmlRoot(crabs)
xmlName(root)
varInfo <- root[[1]][[2]]
varInfo

as.vector(unlist(xmlApply(varInfo, xmlAttrs)))
     

The question is: how many records are there in total? Extract the text value for record 11.

My answer is:

library(XML)

# download the xml data and save it in a file
url <- "http://www.ggobi.org/book/data/australian-crabs.xml"
download.file(url, destfile = "australian-crabs.xml")

# parse the xml data
xml_data <- xmlParse("australian-crabs.xml")

# extract the data from the xml
datPath <- "//record"
datValue <- xpathApply(xml_data, datPath, xmlValue)

# find the total number of records
num_records <- length(datValue)
cat("Total number of records:", num_records, "\n")

# extract the text value for record 11
record11 <- xmlParse(datValue[11])
text_value <- xmlValue(getNodeSet(record11, ".//text")[[1]])
cat("Text value for record 11:", text_value, "\n")

But I get the error below:

trying URL 'http://www.ggobi.org/book/data/australian-crabs.xml'
Content type 'application/xml' length 20212 bytes (19 KB)
==================================================
downloaded 19 KB

Total number of records: 208 
Error in file.exists(file) : invalid 'file' argument

How can I fix this error?


Solution

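  • You already extracted the text values of the records when you ran

    datValue <- xpathApply(xml_data, datPath, xmlValue)

    so datValue is a list of character strings, not XML nodes. Calling xmlParse(datValue[11]) hands xmlParse a one-element list, which it appears to treat as a file name, hence the "invalid 'file' argument" error from file.exists. There is nothing left to parse; take element 11 directly:

    text_value <- datValue[[11]]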
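
To see the difference without downloading anything, here is a minimal sketch that runs the same steps on a small inline document. The XML string, its three records, and the names xml_text, values, second_value and second_via_xpath are made up for illustration; they only imitate the record elements of australian-crabs.xml and are not its real contents.

```r
library(XML)

# Hypothetical miniature document standing in for australian-crabs.xml
xml_text <- '<ggobidata><data><records count="3">
<record label="1">8.1 6.7 16.1</record>
<record label="2">8.8 7.7 18.1</record>
<record label="3">9.2 7.8 19.0</record>
</records></data></ggobidata>'

# asText = TRUE tells xmlParse the argument is XML content, not a file name
doc <- xmlParse(xml_text, asText = TRUE)

# With xmlValue as the function, xpathApply returns a list of strings, not nodes
values <- xpathApply(doc, "//record", xmlValue)
num_records <- length(values)

# [[ ]] extracts the string itself; [ ] would return a one-element list
second_value <- values[[2]]

# Alternative: pick the record by position inside the XPath expression
second_via_xpath <- xpathSApply(doc, "(//record)[2]", xmlValue)

cat("Total number of records:", num_records, "\n")
cat("Text value for record 2:", second_value, "\n")
```

On the real file the same pattern applies with index 11 in place of 2, for example datValue[[11]] or the XPath "(//record)[11]".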