Search code examples
rfunctiondplyracrossrelocate

How can I relocate mutated columns next to the original colums?


I have made a function which mutates across columns and creates new named columns from each of them. The new colums are put to the right side of the dataframe whereas I would like to have them adjacent to each of the original columns. I am looking for a solution that will generalise to any dataframe this function might be used on, writing a select statement to reorder the columns is not automatic enough for my use case.

test_data <- data.frame(data_col_1 = c(1,2,3),
                        data_col_2 = c(1,2,3),
                        data_col_3 = c(1,2,3),
                        another_column = c("a","b","c"))


perc_funct <- function(df, columns, numerator){
  
p_f <- function(x, numerator){
  
  (x/numerator)*100
}
    j <- df %>%
     mutate( across({{columns}}, 
                    .fns = list(perc = ~p_f(.x, numerator)),
                    .names = "{col}_{fn}"))# need to figure out a way to get the columns ordered 
return(j)
}

test_data %>% perc_funct(columns = starts_with("data"), numerator = 1)

The output currently puts all the new colums to the right.

"data_col_1" "data_col_2" "data_col_3" "another_column" "data_col_1_perc" "data_col_2_perc" "data_col_3_perc"

The output I want puts each new colums to the right of each old column. "data_col_1" "data_col_1_perc" "data_col_2" "data_col_2_perc" "data_col_3" "data_col_3_perc" "another_column"


Solution

  • I typically sort the columns with select(sort(names(.))) afterwards:

    library(dplyr)
    
    test_data %>% 
      perc_funct(columns = starts_with("data"), numerator = 1) %>% 
      select(sort(names(.)))
    
    #>   data_col_1 data_col_1_perc data_col_2 data_col_2_perc data_col_3
    #> 1          1             100          1             100          1
    #> 2          2             200          2             200          2
    #> 3          3             300          3             300          3
    #>   data_col_3_perc
    #> 1             100
    #> 2             200
    #> 3             300
    

    Created on 2022-04-01 by the reprex package (v2.0.1)

    What if I have other columns I want to keep in the same spot?

    It's just a matter of nesting my solution above together with other select statements or dplyr verbs. You might have to save the dataframe with the unsorted columns as a intermediate step.

    Example 1

    Here is an example with three other columns, where I want some to come first, some to come last, and others to come anywhere but stay together.

    library(dplyr)
    
    df <- 
      test_data %>% 
      mutate(first_col = 1, other_columns = 100, last_col = 999) %>%
      perc_funct(columns = starts_with("data"), numerator = 1)
    
    # Unsorted:
    df %>% names()
    #> [1] "data_col_1"      "data_col_2"      "data_col_3"      "first_col"      
    #> [5] "other_columns"   "last_col"        "data_col_1_perc" "data_col_2_perc"
    #> [9] "data_col_3_perc"
    
    # Sorted:
    df %>% 
      select(
        first_col,
        df %>% select(starts_with("data")) %>% names() %>% sort(), 
        everything(),
        last_col
      ) 
    #>   first_col data_col_1 data_col_1_perc data_col_2 data_col_2_perc data_col_3
    #> 1         1          1             100          1             100          1
    #> 2         1          2             200          2             200          2
    #> 3         1          3             300          3             300          3
    #>   data_col_3_perc other_columns last_col
    #> 1             100           100      999
    #> 2             200           100      999
    #> 3             300           100      999
    

    Created on 2022-04-01 by the reprex package (v2.0.1)

    Example 2

    There's also an alternative using col_bind():

    If you just want your new columns last, but sorted together with the columns they were created from, you can also do something like:

    library(dplyr)
    df %>% 
      select(
        -starts_with("data")
      ) %>% bind_cols(
        df %>% 
          select(
            df %>% select(starts_with("data")) %>% names() %>% sort()
          )
      )
    #>   first_col other_columns last_col data_col_1 data_col_1_perc data_col_2
    #> 1         1           100      999          1             100          1
    #> 2         1           100      999          2             200          2
    #> 3         1           100      999          3             300          3
    #>   data_col_2_perc data_col_3 data_col_3_perc
    #> 1             100          1             100
    #> 2             200          2             200
    #> 3             300          3             300