Tags: r, json, nested-lists, jsonlite

Flatten sublists within a nested list for tidy JSON file output without unwanted array brackets


In a nutshell, I have just about figured out how to generate a huge nested list from a CSV file, and the aim is to output it in nice, tidy JSON format.

However, the problem is that my JSON output wraps each sublist in its own "[]" array, which results in too many brackets and breaks the next processing step.

I have read a little about this, and setting pretty and auto_unbox only removes a couple of the brackets.

jsonlite::toJSON(updated_nested_lists, pretty=TRUE,auto_unbox = TRUE)

The desired JSON should look like this:

[image not included: desired JSON output]

My JSON currently looks like this:

[image not included: current JSON output]

I think this means I should unpack my nested list at certain levels to achieve this. But I am stuck on how to unpack only particular lists from the inside out: when I tried unlist on the "orders" sublist, it flattened everything together and lost the innermost data structure. I would also like to do this for the full data (same structure, much bigger).

Minimal reproducible data:

list(list(startTimestamp = structure(1620293100, class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), endTimestamp = structure(1620293998.66, class = 
c("POSIXct", 
"POSIXt"), tzone = "UTC"), orders = structure(list(structure(list(
    timestamp = structure(c(1620293100.88, 1620293100.88, 1620293100.88, 
    1620293100.88, 1620293100.88), tzone = "UTC", class = c("POSIXct", 
    "POSIXt")), tradePrice = c(22.63, 22.63, 22.63, 22.63, 22.63
    ), type = c("Mid", "Mid", "Mid", "Mid", "Mid"), volume = c(100L, 
    100L, 100L, 100L, 100L), tradeSum = c(2263, 2263, 2263, 2263, 
    2263)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L)), structure(list(timestamp = structure(1620293100.88, tzone = "UTC", class = 
c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Mid", volume = 300L, 
    tradeSum = 6789), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293100.88, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Mid", volume = 1000L, 
    tradeSum = 22630), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293100.88, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Mid", volume = 3600L, 
    tradeSum = 81468), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293444.68, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.64, type = "Bid", volume = 200L, 
    tradeSum = 4528), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293446.74, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.64, type = "Ask", volume = 2700L, 
    tradeSum = 61128), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293453.78, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Bid", volume = 600L, 
    tradeSum = 13578), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293455.55, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.64, type = "Ask", volume = 100L, 
    tradeSum = 2264), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293457.92, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.64, type = "Ask", volume = 200L, 
    tradeSum = 4528), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293468.28, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Bid", volume = 500L, 
    tradeSum = 11315), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293470.52, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Bid", volume = 300L, 
    tradeSum = 6789), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293482.13, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Bid", volume = 700L, 
    tradeSum = 15841), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293482.13, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Bid", volume = 900L, 
    tradeSum = 20367), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293487.69, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Ask", volume = 200L, 
    tradeSum = 4526), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293501.04, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Bid", volume = 100L, 
    tradeSum = 2262), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293506.57, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Ask", volume = 400L, 
    tradeSum = 9052), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293531.71, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Bid", volume = 200L, 
    tradeSum = 4524), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293578.02, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.61, type = "Bid", volume = 200L, 
    tradeSum = 4522), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293585.77, tzone 
= "UTC", class = c("POSIXct", 
 "POSIXt")), tradePrice = 22.61, type = "Bid", volume = 100L, 
    tradeSum = 2261), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293588.74, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.61, type = "Bid", volume = 100L, 
    tradeSum = 2261), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293589.1, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.61, type = "Bid", volume = 100L, 
    tradeSum = 2261), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293589.1, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.61, type = "Bid", volume = 300L, 
    tradeSum = 6783), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293608.04, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.6, type = "Bid", volume = 100L, tradeSum = 2260), class = 
c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(
    timestamp = structure(1620293633.1, tzone = "UTC", class = c("POSIXct", 
    "POSIXt")), tradePrice = 22.6, type = "Bid", volume = 200L, 
    tradeSum = 4520), class = c("tbl_df", "tbl", "data.frame"
 ), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293639.58, 
tzone = "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.6, type = "Bid", volume = 200L, tradeSum = 4520), class = 
c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(
    timestamp = structure(1620293642.91, tzone = "UTC", class = c("POSIXct", 
    "POSIXt")), tradePrice = 22.6, type = "Ask", volume = 100L, 
    tradeSum = 2260), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293642.91, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.6, type = "Ask", volume = 2900L, 
    tradeSum = 65540), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293654.03, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.6, type = "Ask", volume = 300L, tradeSum = 6780), class = 
c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(
    timestamp = structure(1620293656.97, tzone = "UTC", class = c("POSIXct", 
    "POSIXt")), tradePrice = 22.62, type = "Ask", volume = 100L, 
    tradeSum = 2262), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293660, tzone = 
"UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Ask", volume = 700L, 
    tradeSum = 15834), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293663, tzone = 
"UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Ask", volume = 300L, 
    tradeSum = 6786), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293666.03, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Ask", volume = 200L, 
    tradeSum = 4524), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293672.01, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Ask", volume = 400L, 
    tradeSum = 9048), class = c("tbl_df", "tbl", "data.frame"
 ), row.names = c(NA, -1L)), structure(list(timestamp = structure(c(1620293674.65, 
 1620293674.65), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
    tradePrice = c(22.62, 22.62), type = c("Ask", "Ask"), volume = c(200L, 
    200L), tradeSum = c(4524, 4524)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -2L)), structure(list(timestamp = 
 structure(1620293674.65, tzone = "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Ask", volume = 3600L, 
    tradeSum = 81432), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293677.64, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Ask", volume = 100L, 
    tradeSum = 2263), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293677.67, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Ask", volume = 2200L, 
    tradeSum = 49786), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293677.92, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Ask", volume = 800L, 
    tradeSum = 18104), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293686.34, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.6, type = "Bid", volume = 200L, tradeSum = 4520), class = 
c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -1L)), structure(list(
    timestamp = structure(1620293698.61, tzone = "UTC", class = c("POSIXct", 
    "POSIXt")), tradePrice = 22.62, type = "Bid", volume = 100L, 
    tradeSum = 2262), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293707.6, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Bid", volume = 400L, 
    tradeSum = 9048), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293723.01, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Bid", volume = 400L, 
    tradeSum = 9052), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293723.26, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Ask", volume = 300L, 
    tradeSum = 6789), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293723.29, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Ask", volume = 100L, 
    tradeSum = 2262), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293723.29, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Bid", volume = 100L, 
    tradeSum = 2262), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293723.29, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Ask", volume = 300L, 
    tradeSum = 6789), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293734.89, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Bid", volume = 200L, 
    tradeSum = 4524), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293743.34, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Bid", volume = 800L, 
    tradeSum = 18096), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293743.52, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.62, type = "Ask", volume = 1900L, 
    tradeSum = 42978), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(c(1620293743.95, 
1620293743.95), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
    tradePrice = c(22.62, 22.62), type = c("Ask", "Ask"), volume = c(1000L, 
    1000L), tradeSum = c(22620, 22620)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -2L)), structure(list(
    timestamp = structure(c(1620293771.07, 1620293771.07), tzone = "UTC", class = 
c("POSIXct", 
    "POSIXt")), tradePrice = c(22.61, 22.61), type = c("Bid", 
    "Bid"), volume = c(1400L, 1400L), tradeSum = c(31654, 31654
    )), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-2L)), structure(list(timestamp = structure(1620293771.07, tzone = "UTC", class = 
c("POSIXct", 
 "POSIXt")), tradePrice = 22.62, type = "Bid", volume = 400L, 
    tradeSum = 9048), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293772.39, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.61, type = "Ask", volume = 700L, 
    tradeSum = 15827), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293772.42, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.61, type = "Ask", volume = 900L, 
    tradeSum = 20349), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(c(1620293772.78, 
1620293772.78), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
     tradePrice = c(22.61, 22.61), type = c("Ask", "Ask"), volume = c(100L, 
    100L), tradeSum = c(2261, 2261)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -2L)), structure(list(timestamp = 
structure(c(1620293780.41, 
1620293780.41), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
    tradePrice = c(22.62, 22.62), type = c("Ask", "Ask"), volume = c(500L, 
    500L), tradeSum = c(11310, 11310)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -2L)), structure(list(timestamp = 
structure(1620293815.65, tzone = "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.64, type = "Ask", volume = 100L, 
    tradeSum = 2264), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293845.36, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.63, type = "Bid", volume = 100L, 
    tradeSum = 2263), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293845.36, tzone 
= "UTC", class = c("POSIXct", 
 "POSIXt")), tradePrice = 22.63, type = "Bid", volume = 700L, 
    tradeSum = 15841), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293892.3, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.73, type = "Bid", volume = 100L, 
    tradeSum = 2273), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L)), structure(list(timestamp = structure(1620293998.66, tzone 
= "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = 22.66, type = "Bid", volume = 300L, 
    tradeSum = 6798), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L))), ptype = structure(list(timestamp = structure(numeric(0), 
tzone = "UTC", class = c("POSIXct", 
"POSIXt")), tradePrice = numeric(0), type = character(0), volume = integer(0), 
    tradeSum = numeric(0)), class = c("tbl_df", "tbl", "data.frame"
), row.names = integer(0)), class = c("vctrs_list_of", "vctrs_vctr", 
"list"))))

Solution

  • Your list is indeed too deeply nested. Try unlisting with recursive=FALSE; rapply can be helpful here.

    > unlist(rapply(updated_nested_lists, unlist, how='l', recursive=FALSE), recursive=FALSE) |>
    +   jsonlite::toJSON(pretty=TRUE, auto_unbox=TRUE)
    {
      "startTimestamp": "2021-05-06 09:25:00",
      "endTimestamp": "2021-05-06 09:39:58",
      "orders": [
        {
          "timestamp": ["2021-05-06 09:25:00", "2021-05-06 09:25:00", "2021-05-06 09:25:00", "2021-05-06 09:25:00", "2021-05-06 09:25:00"],
          "tradePrice": [22.63, 22.63, 22.63, 22.63, 22.63],
          "type": ["Mid", "Mid", "Mid", "Mid", "Mid"],
          "volume": [100, 100, 100, 100, 100],
          "tradeSum": [2263, 2263, 2263, 2263, 2263]
        },
        {
          "timestamp": "2021-05-06 09:25:00",
          "tradePrice": 22.63,
          "type": "Mid",
          "volume": 300,
          "tradeSum": 6789
        },
        ...
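
To see what each step of the call above does, here is a minimal sketch on a small stand-in. The name `nested` and its values are hypothetical, and plain data frames stand in for the tibbles in the question. It suggests where the extra brackets come from: the unnamed outer list and every data frame are each serialized as a JSON array.

```r
library(jsonlite)

# Hypothetical stand-in: unnamed outer list -> one record -> list of data frames
nested <- list(list(
  id = 1L,
  orders = list(data.frame(price = c(22.63, 22.64), type = c("Mid", "Bid")))
))

# As is: the outer list and the data frame each become an array
before <- toJSON(nested, auto_unbox = TRUE)

# Step 1: rapply() with how = "list" rebuilds every level as a plain list,
#         so the data frame turns into a named list of columns.
# Step 2: unlist(recursive = FALSE) removes the unnamed outer level.
flat <- unlist(rapply(nested, unlist, how = "list", recursive = FALSE),
               recursive = FALSE)
after <- toJSON(flat, auto_unbox = TRUE)

before
# [{"id":1,"orders":[[{"price":22.63,"type":"Mid"},{"price":22.64,"type":"Bid"}]]}]
after
# {"id":1,"orders":[{"price":[22.63,22.64],"type":["Mid","Bid"]}]}
```

Note that after flattening, each order is written column-wise (one array per field) rather than as one object per row.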
    

    Rather than using this fix, review the data generation code if possible to see if this nesting can be avoided.
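
One way to follow that advice, sketched under assumptions: if the goal is a single flat array of order objects (the desired output image is not available, so this is a guess), the per-order data frames can be stacked into one data frame before serializing. The object `nested` below is a hypothetical stand-in with the same shape as the question's data, not the real data set.

```r
library(jsonlite)

# Hypothetical stand-in with the same shape as the question's data
nested <- list(list(
  startTimestamp = "2021-05-06 09:25:00",
  orders = list(
    data.frame(tradePrice = c(22.63, 22.63), type = c("Mid", "Mid")),
    data.frame(tradePrice = 22.64, type = "Bid")
  )
))

# Take the record out of the unnamed outer list
record <- nested[[1]]

# Stack the per-order data frames into one data frame (one row per order)
record$orders <- do.call(rbind, record$orders)

# A single data frame is written as one array of row objects
json <- toJSON(record, pretty = TRUE, auto_unbox = TRUE)
json
```

With this shape, "orders" is one array containing three row objects instead of an array of arrays. For the tibbles in the question, dplyr::bind_rows would probably be the more usual way to do the stacking step.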

    Note that you can easily check the structure with str:

    > str(updated_nested_lists)
    List of 1
     $ :List of 3
      ..$ startTimestamp: POSIXct[1:1], format: "2021-05-06 09:25:00"
      ..$ endTimestamp  : POSIXct[1:1], format: "2021-05-06 09:39:58"
      ..$ orders        :List of 61
      .. ..$ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 5 obs. of  5 variables:
      .. .. ..$ timestamp : POSIXct[1:5], format: "2021-05-06 09:25:00" "2021-05-06 09:25:00" "2021-05-06 09:25:00" ...
      .. .. ..$ tradePrice: num [1:5] 22.6 22.6 22.6 22.6 22.6
      .. .. ..$ type      : chr [1:5] "Mid" "Mid" "Mid" "Mid" ...
      .. .. ..$ volume    : int [1:5] 100 100 100 100 100
      .. .. ..$ tradeSum  : num [1:5] 2263 2263 2263 2263 2263
      .. ..$ :Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1 obs. of  5 variables:
      .. .. ..$ timestamp : POSIXct[1:1], format: "2021-05-06 09:25:00"
      .. .. ..$ tradePrice: num 22.6