I have a bunch of dataframes that contain individual-level person data from a complex survey, adjusted with survey weights, at the state level. Say one for each state: df_1, df_2, ..., df_50
I have a function, calc_wt(data, age_min, age_max), that takes an individual dataframe (like df_1), a minimum age, and a maximum age, and returns the unweighted count and the weighted means/SEs of the individual-level data in that data.frame once the dataset is subset to those within the minimum and maximum age range.
What I want for each minimum and maximum age range is a results dataframe, where each row is the aggregated data returned by calc_wt() for one of df_1, df_2, ..., df_50.
So I want something like:
rbind(calc_wt(df_1,age_min = 18, age_max = 84),
calc_wt(df_2,age_min = 18, age_max = 84),
....,
calc_wt(df_50,age_min = 18, age_max = 84))
But is there a way to do it without specifying each input dataframe exactly? Maybe something like purrr?
In base R: mget() + lapply() + do.call("rbind", ...)
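# collect every object named df_<number> into a named list
# (ls() sorts names alphabetically, so df_10 comes before df_2)
df_list <- mget(ls(pattern="^df_[0-9]+$"))
# apply calc_wt() to each data frame, passing the age range through
cw_list <- lapply(df_list, calc_wt, age_min = 18, age_max = 84)
result <- do.call("rbind", cw_list)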
You can do this in tidyverse too once you've got df_list, with map() + list_rbind() (or map_dfr(), which I prefer but which tidyverse reports as being superseded ...)
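A minimal sketch of the map() + list_rbind() version, assuming purrr >= 1.0.0 and R >= 4.1 (for the native pipe). The calc_wt() here is a hypothetical stand-in that only returns a count and an unweighted mean, and the two small data frames are made up so that the example runs; substitute your real function and df_list.

```r
library(purrr)

# Hypothetical stand-in for the real calc_wt(): count and unweighted mean age
calc_wt <- function(data, age_min, age_max) {
  sub <- data[data$age >= age_min & data$age <= age_max, , drop = FALSE]
  data.frame(n = nrow(sub), mean_age = mean(sub$age))
}

# Made-up data standing in for the list built with mget()
df_list <- list(
  df_1 = data.frame(age = c(17, 25, 40)),
  df_2 = data.frame(age = c(30, 60, 70, 90))
)

# One row per input data frame; names_to keeps the list names as a column
result <- df_list |>
  map(calc_wt, age_min = 18, age_max = 84) |>
  list_rbind(names_to = "source")
```

The names_to argument is optional, but without it list_rbind() drops the list names, so you would lose track of which row came from which state.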
It would be more robust to go upstream and get your df_* objects as a list in the first place, rather than cluttering your workspace with them and then using mget() to retrieve them (e.g. use map() or lapply() with read_csv() (or whatever) and a character vector of file names ...)
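A rough sketch of that upstream approach, using base read.csv() so it runs without extra packages. The temporary directory, the file names and their contents are all hypothetical, created here only so the example is self-contained; in practice `files` would point at your real per-state files.

```r
# Hypothetical setup: write two small CSV files into a temporary directory
tmp <- tempfile("states_")
dir.create(tmp)
write.csv(data.frame(age = c(20, 35)),
          file.path(tmp, "df_1.csv"), row.names = FALSE)
write.csv(data.frame(age = c(50, 65, 80)),
          file.path(tmp, "df_2.csv"), row.names = FALSE)

# Character vector of file names, then read them straight into a list
files <- list.files(tmp, pattern = "^df_[0-9]+\\.csv$", full.names = TRUE)
df_list <- lapply(files, read.csv)

# Name the list elements after the files so results stay identifiable
names(df_list) <- tools::file_path_sans_ext(basename(files))
```

From here df_list can go straight into the lapply() or map() call, with no mget() step and no df_* objects left in the workspace.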