How to map over function with multiple arguments (and changing data)?


I am writing a function that creates a gt table for each state in a data frame at the state-city level. The data and the columns selected for the gt will change often, so I made them arguments of the function. I want to use purrr::map as a pseudo for loop to iterate the function over the data, while keeping the freedom to change the input data and the selected columns.

The problem is that I don't know how to write the map call so that it accommodates the function's multiple arguments.

How can I change the map call to do this? In this case, how can I use the function I've written to create an individual gt for each of the states in the example data? If there is a different or easier way of accomplishing this, any suggestions are appreciated. Here's what I have tried:

library(gt)
library(tidyverse)

## WRITE THE FUNCTION
make_gts <- function(df, x, def_var, select_vect, title_text){
  
  df_ind <- df %>% filter(def_var == x)
  
  df_ind_clean <- df_ind %>% 
    select(all_of(select_vect)) 
    
  gt(df_ind_clean) %>% 
    tab_header(title = paste("This is a GT for", title_text))
    
  return(gt)
}

## DEFINE ARGUMENTS OF THE FUNCTION
title_text = "Elevation and NumObserved"
select_vect <- c("City", "Elevation", "NumObserved")
def_var <- df$State

## CREATE MAP LIST
iterate_list <- unique(def_var)

## GT LIST
state_gt_list <- set_names(iterate_list) %>% 
  purrr:map(make_gts(df = ex_data,
                     def_var = df$State, 
                     x = iterate_list[i], 
                     select_vect = select_vect,
                     title_text = title_text))

Sample data:

df <- structure(list(State = c("California", "California", "California", 
"Texas", "Texas", "Texas", "New Mexico", "New Mexico", "New Mexico"
), City = c("Los Angeles", "San Francisco", "Fresno", "Dallas", 
"Austin", "Frisco", "Albuquerque", "Santa Fe", "Taos"), NumObserved = c(1200000L, 
825000L, 113000L, 240000L, 189000L, 38000L, 56000L, 23000L, 6000L
), Elevation = c(28L, 47L, 235L, 312L, 550L, 128L, 4291L, 3533L, 
7823L)), class = "data.frame", row.names = c(NA, -9L))

Ideally the output would be a list containing a gt for each state.


Solution

  • Expanding on my comment, and returning a standard tibble, since I don't have the gt package...

    df %>% 
      group_by(State) %>% 
      group_map(
        function(.x, .y, select_vect) {
          df_ind_clean <- .x %>% 
                  select(all_of(select_vect)) 
          # gt(df_ind_clean) %>% 
          #   tab_header(title = paste("This is a GT for", title_text))
          # return(gt)
          df_ind_clean
        },
        select_vect=c("City", "Elevation", "NumObserved")
      )
    

    gives

    [[1]]
    # A tibble: 3 × 3
      City          Elevation NumObserved
      <chr>             <int>       <int>
    1 Los Angeles          28     1200000
    2 San Francisco        47      825000
    3 Fresno              235      113000
    
    [[2]]
    # A tibble: 3 × 3
      City        Elevation NumObserved
      <chr>           <int>       <int>
    1 Albuquerque      4291       56000
    2 Santa Fe         3533       23000
    3 Taos             7823        6000
    
    [[3]]
    # A tibble: 3 × 3
      City   Elevation NumObserved
      <chr>      <int>       <int>
    1 Dallas       312      240000
    2 Austin       550      189000
    3 Frisco       128       38000
    

    Is that close to what you want?
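
If you would rather keep the purrr::map approach from the question, here is a minimal sketch of how it could look, assuming the gt package is installed. Two things in the original attempt get in the way: map expects a function as its second argument (not the result of calling one), and return(gt) returns the gt function itself instead of the table. The group_col argument (a column name passed as a string) and the state subtitle are my own changes for illustration and are not in the question's code.

```r
library(dplyr)
library(purrr)
library(gt)

# Sample data from the question, rebuilt with data.frame()
df <- data.frame(
  State = rep(c("California", "Texas", "New Mexico"), each = 3),
  City = c("Los Angeles", "San Francisco", "Fresno", "Dallas", "Austin",
           "Frisco", "Albuquerque", "Santa Fe", "Taos"),
  NumObserved = c(1200000L, 825000L, 113000L, 240000L, 189000L, 38000L,
                  56000L, 23000L, 6000L),
  Elevation = c(28L, 47L, 235L, 312L, 550L, 128L, 4291L, 3533L, 7823L)
)

# group_col is a hypothetical argument: the grouping column's name as a string,
# replacing the def_var vector used in the question
make_gts <- function(df, x, group_col, select_vect, title_text) {
  df %>%
    filter(.data[[group_col]] == x) %>%   # keep only the rows for value x
    select(all_of(select_vect)) %>%       # keep only the requested columns
    gt() %>%
    tab_header(
      title = paste("This is a GT for", title_text),
      subtitle = x                        # added so each table shows its state
    )
  # the gt table is the last value of the pipe, so it is what gets returned
}

select_vect <- c("City", "Elevation", "NumObserved")
title_text <- "Elevation and NumObserved"

# Wrap the call in an anonymous function so map passes one state at a time;
# set_names() makes the resulting list named by state
state_gt_list <- unique(df$State) %>%
  set_names() %>%
  map(function(s) make_gts(df, s, "State", select_vect, title_text))
```

Each table can then be pulled out by name, for example state_gt_list$Texas or state_gt_list[["New Mexico"]]. To change the data or the columns, only the arguments inside the anonymous function need to change.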