I have split up a dataframe into a list of sub-dataframes (sub_dfs) using dplyr::group_split. Each of these sub_dfs contains data that I then use to create a separate plot. However, I want to create these plots in a specific order.
Every sub_df has a column called 'rank'. I want to create my plots in ascending order of:
(sub_df$rank)[1]
How can I achieve this?
I have tried to find an argument of dplyr::group_split that allows me to control the ordering of the list it outputs, but I have had no luck.
I have also tried to concisely re-order the list after its creation. I can extract the relevant ranks from the list by:
extract_rank <- function(sub_df) { as.numeric(sub_df$rank[1]) }
ranking <- lapply(list, extract_rank)
...and ranking = c(10, 8, 3, 4, 6, 1...) tells me that the first sub_df in list should actually be in the 10th position, etc.
Can I somehow use ranking to re-order my list of sub_dfs?
Make a factor variable that you can split by and that has its levels in the order you want. Here's a simple example with mtcars:
library(dplyr)
mtcars |>
summarize(mpg = mean(mpg), .by = cyl) |>
mutate(cyl_split = factor(cyl, levels = c(4, 8, 6))) |>
group_split(cyl_split)
# <list_of<
# tbl_df<
# cyl : double
# mpg : double
# cyl_split: factor<37885>
# >
# [3]>
# [[1]]
# # A tibble: 1 × 3
# cyl mpg cyl_split
# <dbl> <dbl> <fct>
# 1 4 26.7 4
#
# [[2]]
# # A tibble: 1 × 3
# cyl mpg cyl_split
# <dbl> <dbl> <fct>
# 1 8 15.1 8
#
# [[3]]
# # A tibble: 1 × 3
# cyl mpg cyl_split
# <dbl> <dbl> <fct>
# 1 6 19.7 6
You can order the factor levels manually, or use a function like reorder to order them based on a function of another column.
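As a rough sketch of the reorder approach applied to the situation in the question: the data frame and the column names grp, y and grp_split below are made up for illustration, and I am assuming each group's rank is read from its first row.

```r
library(dplyr)

# Toy data: `grp`, `rank` and `y` are hypothetical stand-ins for the real columns.
df <- tibble(
  grp  = c("a", "a", "b", "b", "c", "c"),
  rank = c(3, 3, 1, 1, 2, 2),
  y    = 1:6
)

sub_dfs <- df |>
  # reorder() sorts the levels of grp by a per-level summary of rank;
  # here the summary is the first rank value found in each group.
  mutate(grp_split = reorder(factor(grp), rank, FUN = function(x) x[1])) |>
  group_split(grp_split)

# First rank of each piece, in the order the list will be plotted
first_ranks <- sapply(sub_dfs, function(d) d$rank[1])
first_ranks
```

If the rank is constant within a group, any summary (min, mean, first value) should give the same ordering, so the choice of FUN only matters when ranks vary inside a group.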
Note that it is not sufficient to order the rows of the data frame with arrange; you need to order the levels of the factor column you split by so that they will be in the order you want when they are sorted.
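To illustrate the difference, here is a small sketch on made-up data (grp, grp_split and the object names are hypothetical). It also shows one possible way to re-order an already-split list by indexing it with order(), which is closer to what the question attempted.

```r
library(dplyr)

# Hypothetical data; `grp` and `rank` stand in for the real columns.
df <- tibble(
  grp  = c("a", "a", "b", "b", "c", "c"),
  rank = c(3, 3, 1, 1, 2, 2)
)

# arrange() only changes the row order. group_split() still returns the
# groups in sorted key order (a, b, c), so the list is not in rank order.
by_rows <- df |> arrange(rank) |> group_split(grp)
ranks_by_rows <- sapply(by_rows, function(d) d$rank[1])

# Setting the factor levels by first appearance after arrange() makes
# group_split() return the groups in rank order.
by_levels <- df |>
  arrange(rank) |>
  mutate(grp_split = factor(grp, levels = unique(grp))) |>
  group_split(grp_split)
ranks_by_levels <- sapply(by_levels, function(d) d$rank[1])

# Alternative: re-order an existing list using the extracted ranks.
reordered <- by_rows[order(ranks_by_rows)]
ranks_reordered <- sapply(reordered, function(d) d$rank[1])
```

The order() route avoids adding a helper column, at the cost of a second pass over the list; the factor route keeps the ordering attached to the data itself.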