Search code examples
jsonrrmongodb

JSON - Character issue


I extracted some data from a mongo database using the RMongo library. I have been working with the data with no problem. However, I need to access a field that was saved, originally in the database, as JSON. Since rmongodb saves the data as data frame, I now have a large character vector of length 1:

res1 = "[ { \"text\" : \"@Kayture Beyoncé jam session ?\" , \"name\" : \"beponcé \xed\xa0\xbc\xed\xbc\xbb\" , \"screenName\" : \"ColaaaaTweedy\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"108061963\"} , { \"text\" : \"@Kayture fucking marry me\" , \"name\" : \"George McQueen\" , \"screenName\" : \"GeorgeMcQueen12\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"67896750\"}]"

I need to extract all the "text" attributes of the objects from this array (there are 2 in this example), but I can not figure out a fast way. I was trying using strsplit, or going from character to json files using jsonlite, and then to list, but it does not work.

Any ideas?

Thanks!


Solution

  • Starting from

    res1 = "[ { \"text\" : \"@Kayture Beyoncé jam session ?\" , \"name\" : \"beponcé \xed\xa0\xbc\xed\xbc\xbb\" , \"screenName\" : \"ColaaaaTweedy\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"108061963\"} , { \"text\" : \"@Kayture fucking marry me\" , \"name\" : \"George McQueen\" , \"screenName\" : \"GeorgeMcQueen12\" , \"follower\" : false , \"mentions\" : [ \"Kayture\"] , \"userTwitterId\" : \"67896750\"}]"
    

    you can use fromJSON() from the jsonlite package to parse that JSON object.

    library(jsonlite)
    
    fromJSON(res1)
    
                                text           name      screenName follower mentions userTwitterId
    1 @Kayture Beyoncé jam session ? beponcé í ¼í¼»   ColaaaaTweedy    FALSE  Kayture     108061963
    2 @Kayture fucking marry me      George McQueen GeorgeMcQueen12    FALSE  Kayture      67896750