r mongolite

Extract individual elements of Arrays produced by Mongolite from a Dataframe


I used mongolite to download some MongoDB data. I converted it to a data frame and then to a tibble, as shown below:

aggregate = as.data.frame(aggregate)
aggregate = as_tibble(aggregate)

duplicates_removed = aggregate %>% distinct(plot, title, released, .keep_all = TRUE)

Now I want to extract individual elements from the arrays. However, RStudio appears to have converted these arrays into textual form. For example, an array containing "Short" and "Western" is shown as c("Short", "Western").

For processing, such as with the count function, I need to count individual elements rather than combinations. How do I do this?

This is what I tried initially:

count(duplicates_removed, vars = duplicates_removed$genres)

[Screenshot of the results]


Solution

  • You have list-columns. While it would likely be better to count the individual elements correctly the first time, we can recover from your first call to count (assuming its result is stored as genres_table) with:

    genres_table %>%
      tidyr::unnest(vars) %>%
      summarize(n = sum(n), .by = vars) # .by requires dplyr 1.1.0 or later
    # # A tibble: 7 × 2
    #   vars          n
    #   <chr>     <int>
    # 1 Short        91
    # 2 Western       2
    # 3 Drama       148
    # 4 Fantasy      18
    # 5 Animation    57
    # 6 Comedy       57
    # 7 History     143
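
For reference, here is a minimal sketch of doing the counting right the first time: unnest the list-column before counting, so each genre gets its own row. The `movies` tibble is a made-up stand-in for `duplicates_removed` (its titles and genres are not from your data); only the `genres` column name is taken from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for duplicates_removed:
# genres is a list-column, one character vector per row.
movies <- tibble(
  title  = c("A", "B", "C"),
  genres = list(c("Short", "Western"), c("Short", "Drama"), "Drama")
)

genre_counts <- movies %>%
  unnest(genres) %>%  # one row per (title, genre) pair
  count(genres)       # count each genre individually

print(genre_counts)
```

Note that `unnest()` drops rows whose list element is empty or NULL by default; pass `keep_empty = TRUE` if those rows should be kept (they would then be counted under NA).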