I would like to import a 2GB JSON file as a data frame or write it to CSV.
I tried RJSONIO::fromJSON("review.json")
but it returns this error:
Error in paste(readLines(content), collapse = "\n") :
result would exceed 2^31-1 bytes
So I tried readr::read_lines_raw("review.json")
and each line came back as one long string, for example:
[1000] "{\"review_id\":\"VcBo0OZVwTmMh278aakFUg\",\"user_id\":\"PV0Rp_Qh1YCIP0192e4ewg\",\"business_id\":\"G7sVtpD6aqpuUB4F3LEG_w\", \"stars\":4.0,\"useful\":0,\"funny\":0,\"cool\":0,\"text\":\"Excellente place que vous passiez juste prendre un bon thé ou café ou que vous vouliez vous asseoir et manger un brownie décadent ou un grill cheese aux oignons caramélisés. Le personnel est sympathique, pas stressé et ne met pas de pression pour consommer. Les enfants sont les très bienvenus et ont de quoi s occuper!!\",\"date\":\"2015-07-03 19:01:30\"}"
It looks like paste(readLines(content), collapse = "\n") tries to collapse the entire file into a single string, which exceeds the 2^31-1 byte limit on a character string in R, and that is why I got this error.
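Each line on its own appears to be a complete, fairly small JSON object; a quick check along these lines (just a sketch) parses the first record by itself:

library(jsonlite)
first_line <- readr::read_lines("review.json", n_max = 1)  # read only the first line
one_review <- jsonlite::fromJSON(first_line)               # a named list: review_id, user_id, stars, text, ...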
I tried the stream_in method as well, but it breaks halfway through streaming the data.
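For reference, that attempt looked roughly like this (the pagesize value is only an example):

library(jsonlite)
reviews <- jsonlite::stream_in(file("review.json"), pagesize = 10000)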
Is there a way to import this file without each line being read in as one entire string? Or is there an alternative approach to fromJSON() altogether?
I do not have enough 'rep' to comment, but thought you might give this a try:
library(jsonlite)
my.jsonFile <- 'review.json'
# Parse the whole file, flattening nested data frames into regular columns
x <- jsonlite::fromJSON(my.jsonFile, flatten = TRUE)
# Collapse any remaining nested lists into a flat list of leaf elements
y <- lapply(rapply(x, enquote, how = "unlist"), eval)
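If fromJSON() still hits the same size limit, the sample line above suggests the file is newline-delimited JSON (one object per line), so a chunked jsonlite::stream_in() with a handler that appends each page to a CSV would avoid holding the whole 2GB in memory at once. A rough sketch, where the pagesize and the output file name are assumptions:

library(jsonlite)

out_csv <- "review.csv"   # assumed output path
first   <- TRUE           # write the header only for the first chunk

# Parse 10,000 lines at a time; each chunk arrives as a data frame
# and is appended to the CSV, so the full file never sits in memory.
stream_in(file("review.json"), pagesize = 10000, handler = function(df) {
  df <- flatten(df)       # flatten any nested data-frame columns
  write.table(df, out_csv, sep = ",", row.names = FALSE,
              col.names = first, append = !first)
  first <<- FALSE
})

The resulting CSV can then be read back with data.table::fread() or readr::read_csv() as a regular data frame.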