I am trying to parse some data that I am retrieving from an API but I keep getting the following error when I go to use fromJSON():
Error: parse error: trailing garbage
01/28/20|000010102|St. John's|OZ
(right here) ------^
Script that isn't working:
library(httr)
library(jsonlite)
library(tidyverse)
url<-"http://files.airnowtech.org/airnow/yesterday/daily_data_v2.dat"
my_raw_result<-httr::GET(url)
my_content<-httr::content(my_raw_result,as="text")
my_content_from_json<-jsonlite::fromJSON(my_content)
I checked the status and it's 200, and when I ran http_type(my_raw_result) it says "application/octet-stream". This is my first time trying to access data from an API, so I have no clue what this means. Should I be using a different function to parse? I would appreciate any guidance.
That data source is not in JSON format. For example, the first three lines look like this:
[1] "01/28/20|000010102|St. John's|OZONE-1HR|PPB|37|1|Newfoundland & Labrador DEC|-999|-999|47.652800|-52.816700|124000010102"
[2] "01/28/20|000010102|St. John's|OZONE-8HR|PPB|35|8|Newfoundland & Labrador DEC|32|0|47.652800|-52.816700|124000010102"
[3] "01/28/20|000010501|Grand Falls Windsor|OZONE-1HR|PPB|40|1|Newfoundland & Labrador DEC|-999|-999|49.019400|-55.802800|124000010501"
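A preview like the one above can be produced by splitting the downloaded text into lines. The sketch below runs on a small inline sample instead of the live download; sample_text is a made-up stand-in for my_content, and the JSON check is only a rough first-character heuristic, not a validator.

```r
# Hypothetical stand-in for my_content: two records copied from the output above.
sample_text <- paste(
  "01/28/20|000010102|St. John's|OZONE-1HR|PPB|37|1|Newfoundland & Labrador DEC|-999|-999|47.652800|-52.816700|124000010102",
  "01/28/20|000010102|St. John's|OZONE-8HR|PPB|35|8|Newfoundland & Labrador DEC|32|0|47.652800|-52.816700|124000010102",
  sep = "\n"
)

# Split the single string into lines and keep (at most) the first three.
first_lines <- head(strsplit(sample_text, "\n", fixed = TRUE)[[1]], 3)
print(first_lines)

# Rough heuristic: a JSON document normally starts with "{" or "[".
# These records start with a date, so fromJSON() has nothing valid to parse.
looks_like_json <- substr(trimws(sample_text), 1, 1) %in% c("{", "[")
print(looks_like_json)
```

With the real download, the same two steps applied to my_content should show quickly whether fromJSON() is the right tool.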
It would be good to check with the original source about the definitions of the format, but it looks like a delimited format with | used to separate columns. If that's true, here's one way to read it from your my_content variable:
# header = FALSE because the first line of the file is data, not column names
my_content_from_delim <- my_content %>% textConnection %>% readLines %>% read.delim(text = ., sep = "|", header = FALSE)
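A slightly fuller sketch of the same idea follows, again on an inline sample so that it runs without network access. The column names are guesses from the values shown above, not an official schema, and sample_text is a hypothetical stand-in for my_content. Reading every column as character first keeps the leading zeros in ID-like fields such as 000010102, and header = FALSE stops the first record from being used as column names.

```r
# Hypothetical stand-in for my_content: the three records quoted above.
sample_text <- paste(
  "01/28/20|000010102|St. John's|OZONE-1HR|PPB|37|1|Newfoundland & Labrador DEC|-999|-999|47.652800|-52.816700|124000010102",
  "01/28/20|000010102|St. John's|OZONE-8HR|PPB|35|8|Newfoundland & Labrador DEC|32|0|47.652800|-52.816700|124000010102",
  "01/28/20|000010501|Grand Falls Windsor|OZONE-1HR|PPB|40|1|Newfoundland & Labrador DEC|-999|-999|49.019400|-55.802800|124000010501",
  sep = "\n"
)

# Assumed column names (guesses; check the data provider's format description).
col_names <- c("date", "site_id", "site_name", "parameter", "units", "value",
               "averaging_period", "data_source", "aqi_value", "aqi_category",
               "latitude", "longitude", "full_site_id")

air <- read.delim(
  text = sample_text,
  sep = "|",
  header = FALSE,            # the file has no header row
  col.names = col_names,
  colClasses = "character",  # keep leading zeros in ID columns
  quote = ""                 # treat quote characters in site names literally
)

# Convert the columns that are clearly numeric or dates.
num_cols <- c("value", "latitude", "longitude")
air[num_cols] <- lapply(air[num_cols], as.numeric)
air$date <- as.Date(air$date, format = "%m/%d/%y")

str(air)
```

To use it on the real download, replace sample_text with my_content. The -999 values look like a missing-data code, but that is worth confirming with the data provider before recoding them to NA.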