Search code examples
rjsontidyrbroomrjson

How to read nested json file and pull information from dataframe inside dataframe in r?


I am reading a json file as a response from api which is nested and when I look at them in a dataframe/table structure format then there are data frame under data frame.

I have saved file at github location.

library(tidyverse)
library(rjson)
library(jsonlite)
library(RCurl)
library(httr)
library(broom)

# file is available at this github link:

file_url1 <- "https://raw.githubusercontent.com/johnsnow09/covid19-df_stack-code/main/json_response.json"

json_response <- fromjson(url(file_url1))

I am interested in data inside sessions variable and have tried using map, tidy but it didn't work.

json_response %>%

#converting dataframe cols to list
  map_if(., is.data.frame, list) %>% 
  
  # to tibble
  map_if(is_list, tibble) %>%
  mutate(sessions = map(sessions, broom::tidy),
         vaccine_fees = map(vaccine_fees, tidy)) %>% 
  unnest()

Above code is not working!!

Also read from another SO post Converting nested JSON file to R dataframe and tried:

library(rjson)

url(file_url1) %>% 
  # jsonlite::fromJSON()
  rjson::fromJSON()
map_dfr(json_response[[1]]$centers[[1]]$sessions[[1]], as.tibble)

But have not been able to get the correct information.

The purpose of this to get information about following fields from the json file: district_name,name,min_age_limit,available_capacity,available_capacity_dose1,vaccine and then can apply filters on it based on the desired values.


Solution

  • You can unnest the sessions variable and the slots variable within it to get a long dataframe.

    library(magrittr)
    library(tidyr)
    
    jsonlite::fromJSON(file_url1) %>%
      .[[1]] %>% 
      unnest(sessions) %>%
      unnest(slots)
    
    # A tibble: 280 x 20
    #   center_id name    address   state_name district_name block_name pincode   lat  long from 
    #       <int> <chr>   <chr>     <chr>      <chr>         <chr>        <int> <int> <int> <chr>
    # 1      1273 CGHS W… CGHS Pat… Delhi      West Delhi    Not Appli…  110008    28    77 09:0…
    # 2      1273 CGHS W… CGHS Pat… Delhi      West Delhi    Not Appli…  110008    28    77 09:0…
    # 3      1273 CGHS W… CGHS Pat… Delhi      West Delhi    Not Appli…  110008    28    77 09:0…
    # 4      1273 CGHS W… CGHS Pat… Delhi      West Delhi    Not Appli…  110008    28    77 09:0…
    # 5    594035 Tihar … Tihar Ja… Delhi      West Delhi    Not Appli…  110064    28    77 09:0…
    # 6    594035 Tihar … Tihar Ja… Delhi      West Delhi    Not Appli…  110064    28    77 09:0…
    # 7    594035 Tihar … Tihar Ja… Delhi      West Delhi    Not Appli…  110064    28    77 09:0…
    # 8    594035 Tihar … Tihar Ja… Delhi      West Delhi    Not Appli…  110064    28    77 09:0…
    # 9    594035 Tihar … Tihar Ja… Delhi      West Delhi    Not Appli…  110064    28    77 09:0…
    #10    594035 Tihar … Tihar Ja… Delhi      West Delhi    Not Appli…  110064    28    77 09:0…
    # … with 270 more rows, and 10 more variables: to <chr>, fee_type <chr>, session_id <chr>,
    #   date <chr>, available_capacity <int>, min_age_limit <int>, vaccine <chr>, slots <chr>,
    #   available_capacity_dose1 <int>, available_capacity_dose2 <int>