
How to str_detect a pattern in the pipe to relocate columns


I am struggling to relocate the descriptive-statistics columns so that they sit next to the original columns they were computed from:

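library(dplyr)  # needed for %>%, across() and where(), which are not prefixed below

# Keep columns 2-4 of iris, then append five summary columns per numeric column
iris2 <- iris %>%
  dplyr::select(2:4) %>%
  dplyr::mutate(across(.cols = where(is.numeric), .fns = list(
        n_miss = ~ sum(is.na(.x)),
        mean   = ~ mean(.x, na.rm = TRUE),
        median = ~ median(.x, na.rm = TRUE),
        min    = ~ min(.x, na.rm = TRUE),
        max    = ~ max(.x, na.rm = TRUE)
      )))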

As you can see, the descriptive columns generated by this step are placed at the end of the data frame, and I would like each of them to follow the column it describes.

So I have two related questions. Here is some of the syntax I tried (none of it works):

#First: how to pass the pattern using str_detect or grep to relocate descriptive columns

iris2 <- iris2 %>% dplyr::relocate(str_detect(colnames, "Sepal.Width"), .after =   Sepal.Width)

# Second: would it be possible to do it for all columns, passing the name in a vector 

iris2 <- iris2 %>% dplyr::relocate(str_detect(colnames, c("Sepal.Width", "Petal.Width"), .after = list(Sepal.Width,  Petal.Width)))

The expected output would be something like this for every column (shown here for Sepal.Width):

expected_output <- structure(list(Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6, 3.9), Sepal.Width_mean = c(3.05733333333333, 
3.05733333333333, 3.05733333333333, 3.05733333333333, 3.05733333333333, 
3.05733333333333), Sepal.Width_median = c(3, 3, 3, 3, 3, 3), 
    Sepal.Width_min = c(2, 2, 2, 2, 2, 2), Sepal.Width_max = c(4.4, 
    4.4, 4.4, 4.4, 4.4, 4.4), Petal.Length = c(1.4, 1.4, 1.3, 
    1.5, 1.4, 1.7)), row.names = c(NA, 6L), class = "data.frame")

I haven't tried the arrange function because I was hoping to do all of this within the pipe, but if that is not possible I will do it in two steps.

Thank you!


Solution

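  • In base R, strip the statistic suffix from each column name and order the columns by the remaining base name, in order of first appearance: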

    nms <- sub("_.*", "", names(iris2))
    iris2[, order(ordered(nms, unique(nms)))]
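
    To see why this groups the columns, here is a minimal sketch on a toy vector of names (`cols`, `base`, `grp` and `idx` are hypothetical names used only for illustration). It assumes the original column names contain no underscore themselves, because `sub("_.*", "", ...)` cuts at the first underscore.

    ```r
    # Toy column names: two base columns first, their statistics appended afterwards
    cols <- c("a", "b", "a_mean", "b_mean", "a_max", "b_max")

    # Drop everything from the first underscore onwards to recover the base name
    base <- sub("_.*", "", cols)

    # Ordered factor whose levels follow first appearance (a before b),
    # so the columns are not sorted alphabetically
    grp <- ordered(base, levels = unique(base))

    # order() sorts by factor level and leaves ties in their original order,
    # so each base column stays ahead of its own statistics
    idx <- order(grp)

    cols[idx]
    ```

    Here `cols[idx]` comes out as a, a_mean, a_max, b, b_mean, b_max.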
    

    If you want to do this in a pipe (with dplyr and stringr loaded), use:

    iris2 %>%
       select(order(str_remove(names(.), "_.*") %>%
              ordered(unique(.))))
    

    The structure of the result then looks like this:

    'data.frame':   150 obs. of  18 variables:
     $ Sepal.Width        : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
     $ Sepal.Width_n_miss : int  0 0 0 0 0 0 0 0 0 0 ...
     $ Sepal.Width_mean   : num  3.06 3.06 3.06 3.06 3.06 ...
     $ Sepal.Width_median : num  3 3 3 3 3 3 3 3 3 3 ...
     $ Sepal.Width_min    : num  2 2 2 2 2 2 2 2 2 2 ...
     $ Sepal.Width_max    : num  4.4 4.4 4.4 4.4 4.4 4.4 4.4 4.4 4.4 4.4 ...
     $ Petal.Length       : num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
     $ Petal.Length_n_miss: int  0 0 0 0 0 0 0 0 0 0 ...
     $ Petal.Length_mean  : num  3.76 3.76 3.76 3.76 3.76 ...
     $ Petal.Length_median: num  4.35 4.35 4.35 4.35 4.35 4.35 4.35 4.35 4.35 4.35 ...
     $ Petal.Length_min   : num  1 1 1 1 1 1 1 1 1 1 ...
     $ Petal.Length_max   : num  6.9 6.9 6.9 6.9 6.9 6.9 6.9 6.9 6.9 6.9 ...
     $ Petal.Width        : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
     $ Petal.Width_n_miss : int  0 0 0 0 0 0 0 0 0 0 ...
     $ Petal.Width_mean   : num  1.2 1.2 1.2 1.2 1.2 ...
     $ Petal.Width_median : num  1.3 1.3 1.3 1.3 1.3 1.3 1.3 1.3 1.3 1.3 ...
     $ Petal.Width_min    : num  0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 ...
     $ Petal.Width_max    : num  2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 ...
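
For the `relocate()` approach asked about in the question, a possible sketch follows. `relocate()` selects columns with tidyselect helpers such as `starts_with()` or `matches()` rather than a logical vector from `str_detect()`, and each call moves the selected columns to a single destination, so the sketch repeats the call once per base column with `Reduce()`. The names `iris_small`, `one`, `base_cols` and `reordered` are hypothetical, and only two statistics are computed to keep the example short.

```r
library(dplyr)

# Smaller version of the question's data: two statistics per column
iris_small <- iris %>%
  select(2:4) %>%
  mutate(across(where(is.numeric), list(
    mean = ~ mean(.x, na.rm = TRUE),
    max  = ~ max(.x, na.rm = TRUE)
  )))

# First question: move the statistics of one column right after that column
one <- iris_small %>%
  relocate(starts_with("Sepal.Width_"), .after = Sepal.Width)

# Second question: repeat the same move for every base column in a vector
base_cols <- c("Sepal.Width", "Petal.Length", "Petal.Width")
reordered <- Reduce(
  function(df, col) {
    relocate(df, starts_with(paste0(col, "_")), .after = all_of(col))
  },
  base_cols,
  iris_small
)

names(reordered)
```

Unlike the name-stripping approach above, this matches on the full base name plus an underscore, so it should also cope with base column names that contain underscores, as long as no base name is a prefix of another.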