json r escaping double-quotes

Escape quotes in R before assigning a string


I am trying to load a JSON file and do some analysis in R.

The JSON file contains parts like this:

 '{"property":"blabla \"some goofy name\" more blabla"}'

That is, the string value of a property contains a couple of escaped double quotes. This should be valid JSON (or is it not?).

The problem is that to parse it with jsonlite or any other library, I first need to assign it to a string variable in R, like this:

 a = '{"property":"blabla \"some goofy name\" more blabla"}'

But then, if I type a and press Enter, I get this back:

[1] "{\"property\":\"blabla \"some goofy name\" more blabla\"}"

This means the existing \" instances have become indistinguishable from the plain " instances, so I can't even replace them with a regular expression. If I feed this to any JSON parsing library, I get errors about invalid characters and the like.

Is there any way to 'catch' those nasty \" instances before R treats them the same as a plain ", so that I can eliminate the \" and continue the JSON parsing?

The difference from a similar question is that the inner quotes are already escaped, forming valid JSON. My ultimate goal is to parse this JSON: http://next.openspending.org/api/3/cubes/ba94aabb80080745688ad38ccad9bfea:at-austria-at11-burgenland/facts?pagesize=30


Solution

  • Updated answer following the OP's update

    I may still not have fully understood what you want to accomplish, so let me know if this is not your intended output. I didn't deal with the newline characters in your file, since they don't seem relevant. Your file contains strings such as "\"Bienenkorb\"", as you described.

    url <- "http://next.openspending.org/api/3/cubes/ba94aabb80080745688ad38ccad9bfea:at-austria-at11-burgenland/facts?pagesize=30"
    parsed <- jsonlite::fromJSON(url)
    print(parsed$data$activity_project_id.project_name[3])
    #[1] "Neugestaltung und\nModernisierung des\nRestaurants \"Bienenkorb\""
    cat(parsed$data$activity_project_id.project_name[3])
    #Neugestaltung und
    #Modernisierung des
    #Restaurants "Bienenkorb"
    

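    If you want to assign it to a string and then parse it, you can do s <- readLines(url); parsed <- jsonlite::fromJSON(s). Because the text is read from the connection rather than typed as an R string literal, the \" escapes are preserved.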
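
To make the difference between typing the JSON as an R literal and reading it as text more concrete, here is a minimal sketch using the sample string from the question. It assumes R 4.0.0 or later (for raw strings) and that jsonlite is installed. The variable names (json_raw, json_lost, json_doubled) and the temporary file are only for illustration and do not come from the original post.

```r
library(jsonlite)

# The JSON text as it would appear in a file: the inner quotes are preceded
# by a backslash. A raw string (R >= 4.0.0) keeps backslashes untouched.
json_raw <- r"({"property":"blabla \"some goofy name\" more blabla"})"

# In an ordinary R literal, \" is R's own escape for a plain ", so the
# backslashes never reach the string and the text is no longer valid JSON.
json_lost <- '{"property":"blabla \"some goofy name\" more blabla"}'

# Doubling each backslash in an ordinary literal yields the same text
# as the raw string.
json_doubled <- '{"property":"blabla \\"some goofy name\\" more blabla"}'

grepl("\\", json_raw, fixed = TRUE)   # TRUE: backslashes are present
grepl("\\", json_lost, fixed = TRUE)  # FALSE: R consumed them
identical(json_raw, json_doubled)     # TRUE

# jsonlite::validate() reports whether a string is well-formed JSON.
isTRUE(validate(json_raw))   # TRUE
isTRUE(validate(json_lost))  # FALSE

parsed <- fromJSON(json_raw)
cat(parsed$property, "\n")
# blabla "some goofy name" more blabla

# Text read from a file (or URL) is never interpreted as an R literal,
# so the escapes survive without any extra work.
tmp <- tempfile(fileext = ".json")
writeLines(json_raw, tmp)
s <- paste(readLines(tmp), collapse = "\n")  # one string, even for multi-line files
from_file <- fromJSON(s)
identical(from_file$property, parsed$property)  # TRUE
```

Note that print() shows every " inside a string as \", whether or not a backslash is actually stored, so cat() or grepl() with fixed = TRUE is a more reliable way to check what the string really contains.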