I have made a function which mutates across columns and creates new named columns from each of them. The new colums are put to the right side of the dataframe whereas I would like to have them adjacent to each of the original columns. I am looking for a solution that will generalise to any dataframe this function might be used on, writing a select statement to reorder the columns is not automatic enough for my use case.
test_data <- data.frame(data_col_1 = c(1,2,3),
data_col_2 = c(1,2,3),
data_col_3 = c(1,2,3),
another_column = c("a","b","c"))
perc_funct <- function(df, columns, numerator){
p_f <- function(x, numerator){
(x/numerator)*100
}
j <- df %>%
mutate( across({{columns}},
.fns = list(perc = ~p_f(.x, numerator)),
.names = "{col}_{fn}"))# need to figure out a way to get the columns ordered
return(j)
}
test_data %>% perc_funct(columns = starts_with("data"), numerator = 1)
The output currently puts all the new colums to the right.
"data_col_1" "data_col_2" "data_col_3" "another_column" "data_col_1_perc" "data_col_2_perc" "data_col_3_perc"
The output I want puts each new colums to the right of each old column. "data_col_1" "data_col_1_perc" "data_col_2" "data_col_2_perc" "data_col_3" "data_col_3_perc" "another_column"
I typically sort the columns with select(sort(names(.)))
afterwards:
library(dplyr)
test_data %>%
perc_funct(columns = starts_with("data"), numerator = 1) %>%
select(sort(names(.)))
#> data_col_1 data_col_1_perc data_col_2 data_col_2_perc data_col_3
#> 1 1 100 1 100 1
#> 2 2 200 2 200 2
#> 3 3 300 3 300 3
#> data_col_3_perc
#> 1 100
#> 2 200
#> 3 300
Created on 2022-04-01 by the reprex package (v2.0.1)
It's just a matter of nesting my solution above together with other select
statements or dplyr verbs. You might have to save the dataframe with the unsorted columns as a intermediate step.
Here is an example with three other columns, where I want some to come first, some to come last, and others to come anywhere but stay together.
library(dplyr)
df <-
test_data %>%
mutate(first_col = 1, other_columns = 100, last_col = 999) %>%
perc_funct(columns = starts_with("data"), numerator = 1)
# Unsorted:
df %>% names()
#> [1] "data_col_1" "data_col_2" "data_col_3" "first_col"
#> [5] "other_columns" "last_col" "data_col_1_perc" "data_col_2_perc"
#> [9] "data_col_3_perc"
# Sorted:
df %>%
select(
first_col,
df %>% select(starts_with("data")) %>% names() %>% sort(),
everything(),
last_col
)
#> first_col data_col_1 data_col_1_perc data_col_2 data_col_2_perc data_col_3
#> 1 1 1 100 1 100 1
#> 2 1 2 200 2 200 2
#> 3 1 3 300 3 300 3
#> data_col_3_perc other_columns last_col
#> 1 100 100 999
#> 2 200 100 999
#> 3 300 100 999
Created on 2022-04-01 by the reprex package (v2.0.1)
There's also an alternative using col_bind()
:
If you just want your new columns last, but sorted together with the columns they were created from, you can also do something like:
library(dplyr)
df %>%
select(
-starts_with("data")
) %>% bind_cols(
df %>%
select(
df %>% select(starts_with("data")) %>% names() %>% sort()
)
)
#> first_col other_columns last_col data_col_1 data_col_1_perc data_col_2
#> 1 1 100 999 1 100 1
#> 2 1 100 999 2 200 2
#> 3 1 100 999 3 300 3
#> data_col_2_perc data_col_3 data_col_3_perc
#> 1 100 1 100
#> 2 200 2 200
#> 3 300 3 300