I am trying to query the Cameo database.
If I open the URL https://cameo.mfa.org/api.php?action=query&pageids=17051&prop=extracts&format=json in a browser, I get valid JSON output.
However, if I use:
library(httr)
library(jsonlite)
base_url <- "https://cameo.mfa.org/api.php"
query_param <- list(
  action = "query",
  pageids = "17051",
  format = "json",
  prop = "extracts"
)
parsed_content <- httr::GET(base_url, query_param)
jsonlite::fromJSON(content(parsed_content, as = "text", encoding = "UTF-8"))
Then jsonlite fails because the response is HTML rather than JSON.
Do you have any advice on this?
The second argument to httr::GET is config=, so passing query_param positionally hands it to config= instead of the query string. Pass it by name instead, as query = query_param:
res <- httr::GET(base_url, query = query_param)
res
# Response [https://cameo.mfa.org/api.php?action=query&pageids=17051&format=json&prop=extracts]
# Date: 2023-07-03 15:06
# Status: 200
# Content-Type: application/json; charset=utf-8
# Size: 5.22 kB
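The URL printed in the response shows that the list was encoded into the query string. If you want to check that without sending a request, one option is httr::modify_url(), which should build the URL in the same way when given query =. This is only a sketch for inspection; the variable names url_without_query and url_with_query are made up for the example.

```r
library(httr)

base_url <- "https://cameo.mfa.org/api.php"
query_param <- list(
  action = "query",
  pageids = "17051",
  format = "json",
  prop = "extracts"
)

# No query supplied: the URL carries no parameters at all
url_without_query <- httr::modify_url(base_url)

# Named query = argument: each list element becomes a key=value pair
url_with_query <- httr::modify_url(base_url, query = query_param)

print(url_without_query)
print(url_with_query)
```

If the second URL does not contain pageids=17051 and format=json, the parameters are not reaching the server, which would be consistent with getting an HTML page back instead of JSON.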
str(httr::content(res))
# List of 3
# $ batchcomplete: chr ""
# $ warnings :List of 1
# ..$ extracts:List of 1
# .. ..$ *: chr "HTML may be malformed and/or unbalanced and may omit inline images. Use at your own risk. Known problems are li"| __truncated__
# $ query :List of 1
# ..$ pages:List of 1
# .. ..$ 17051:List of 4
# .. .. ..$ pageid : int 17051
# .. .. ..$ ns : int 0
# .. .. ..$ title : chr "Copper"
# .. .. ..$ extract: chr "<h2><span id=\"Description\">Description</span></h2>\n<p>A reddish-brown, ductile, metallic element. Copper is "| __truncated__
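To pull the page text out of that structure, note that the page ID is a list name, so it has to be indexed as a string. The sketch below runs offline on a hand-written, heavily abridged stand-in for the response body (json_text is an assumption made for the example, not the real payload); with a live response you would use httr::content(res, as = "text", encoding = "UTF-8") in its place. The gsub() call is a crude way to drop tags, not a proper HTML parser.

```r
library(jsonlite)

# Hand-written stand-in for the response body, trimmed to the fields used here.
# With a live response: json_text <- httr::content(res, as = "text", encoding = "UTF-8")
json_text <- '{"query":{"pages":{"17051":{"pageid":17051,"ns":0,"title":"Copper","extract":"<p>A reddish-brown, ductile, metallic element.</p>"}}}}'

parsed <- jsonlite::fromJSON(json_text)

# Page IDs are JSON object keys, so they become list names; index with a string
page <- parsed$query$pages[["17051"]]

page$title
page$extract

# Crude tag removal, good enough for a quick look at the text
plain_text <- gsub("<[^>]+>", "", page$extract)
plain_text
```

If you query several page IDs at once, lapply() over parsed$query$pages would apply the same extraction to each page.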