Extract JSON data from the rows of an R data frame


I have a data frame where the values of the column Parameters are JSON strings:

#  Parameters
#1 {"a":0,"b":[10.2,11.5,22.1]}
#2 {"a":3,"b":[4.0,6.2,-3.3]}
...

I want to extract the parameters of each row and append them to the data frame as columns A, B1, B2 and B3.

How can I do it?

I would rather use dplyr if it is possible and efficient.


Solution

  • In your example data, each row contains a JSON object. This format is called JSON Lines (also known as NDJSON), and the jsonlite package provides a dedicated function, stream_in, to parse such data into a data frame:

    # Example data
    mydata <- data.frame(parameters = c(
      '{"a":0,"b":[10.2,11.5,22.1]}',
      '{"a":3,"b":[4.0,6.2,-3.3]}'
    ), stringsAsFactors = FALSE)
    
    # Parse json lines
    res <- jsonlite::stream_in(textConnection(mydata$parameters))
    
    # Extract columns
    a <- res$a
    b1 <- sapply(res$b, "[", 1)
    b2 <- sapply(res$b, "[", 2)
    b3 <- sapply(res$b, "[", 3)
    

    In your example, the JSON structure is fairly simple, so the other suggestions work as well, but this solution generalizes to more complex JSON structures.
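
Since the question asks for the parsed values to be appended to the original data frame as columns A, B1, B2 and B3, and mentions a preference for dplyr, here is a minimal sketch of that last step. It assumes every row has exactly three values in b and that stream_in returns b as a list column with one numeric vector per row; the name result is only illustrative.

```r
library(dplyr)

# Example data, as in the question
mydata <- data.frame(parameters = c(
  '{"a":0,"b":[10.2,11.5,22.1]}',
  '{"a":3,"b":[4.0,6.2,-3.3]}'
), stringsAsFactors = FALSE)

# Parse the JSON lines; verbose = FALSE suppresses the progress messages
res <- jsonlite::stream_in(textConnection(mydata$parameters), verbose = FALSE)

# Append the parsed values as new columns.
# Assumption: res$b is a list with one numeric vector of length 3 per row.
result <- mydata %>%
  mutate(
    A  = res$a,
    B1 = sapply(res$b, `[`, 1),
    B2 = sapply(res$b, `[`, 2),
    B3 = sapply(res$b, `[`, 3)
  )

print(result)
```

Because stream_in returns rows in the same order as the input lines, the new columns line up with the original rows without needing a join.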