R: Tidyverse selection semantics: tidyselect::eval_select appends numbers to duplicated names

I have been trying for some time to understand the tidyverse design and how to program with it. While writing a function that uses tidyselect semantics, I found that tidyselect::eval_select appends numbers to the name given on the lhs of an expression whenever that expression selects more than one column. This is not surprising, given that the same semantics are used for column renaming. However, my function builds a data structure and does not need this behavior; it needs the plain name provided on the lhs, duplicated as many times as necessary. I have not managed to find out where this behavior comes from. It looks like make.unique, but I can't find where it is implemented. If you know, I am curious to learn, but solving my problem shouldn't depend on it. All I want is for the lhs names not to have numbers appended. The example below shows the current behavior, followed by the output I want:

library(tidyverse)

# Data
data <- mtcars[, 8:11]

# Example
data %>%
  tidyselect::eval_select(rlang::expr(c(foo = 1, bar = c(2:4), foobar = c(1, "am", "gear", "carb"))), .)
#>     foo    bar1    bar2    bar3 foobar1 foobar2 foobar3 foobar4 
#>       1       2       3       4       1       2       3       4

# Function
test <- function(.data, ...) {
  loc <- tidyselect::eval_select(rlang::expr(c(...)), .data)
  names <- names(.data)
  list(names(loc), names[loc])
}

data %>%
  test(foo = 1, bar = c(2:4), foobar = c(1, "am", "gear", "carb"))
#> [[1]]
#> [1] "foo"     "bar1"    "bar2"    "bar3"    "foobar1" "foobar2" "foobar3"
#> [8] "foobar4"
#> 
#> [[2]]
#> [1] "vs"   "am"   "gear" "carb" "vs"   "am"   "gear" "carb"

Created on 2021-05-22 by the reprex package (v2.0.0)

Desired output:

#> [[1]]
#> [1] "foo"    "bar"    "bar"    "bar"    "foobar" "foobar" "foobar" "foobar"
#> 
#> [[2]]
#> [1] "vs"   "am"   "gear" "carb" "vs"   "am"   "gear" "carb"

Any help is greatly appreciated.

Solution

The behavior comes from a function called ensure_named, nested deep inside eval_select's implementation. It is called from the vars_select_eval function:

ensure_named(pos, vars, uniquely_named, allow_rename)

The good news is that we only need to override the uniquely_named argument. It is passed down from eval_select_impl, the internal function that eval_select itself calls. So all we need is our own version of tidyselect::eval_select that exposes this argument. Note that eval_select_impl is not exported (hence the ::: in the code below), so this approach may break with other tidyselect versions.

To get the wanted output we need to do two things:

1. Add uniquely_named = NULL as an argument and set it to FALSE when calling the function.
2. Set the existing argument name_spec = "{outer}". This step alone will not suffice; uniquely_named must also be FALSE.

Before the actual code, a note of caution:

tidyselect::eval_select deliberately does not allow duplicate column names.

For starters, it is not easy to create a tibble with duplicate column names:

tibble(a = 1:3, b = 4:6, a = 7:9)
#> Error: Column name `a` must not be duplicated.
#> Use .name_repair to specify repair.

One workaround is to use a list with tibble::new_tibble:

tibble::new_tibble(list(a = 1:3, b = 4:6, a = 7:9), nrow = 3)
#> # A tibble: 3 x 3
#>       a     b     a
#>   <int> <int> <int>
#> 1     1     4     7
#> 2     2     5     8
#> 3     3     6     9
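
The error message above also mentions .name_repair. As a small illustration that is not part of the original answer, the sketch below shows two of the documented repair strategies: "minimal" keeps the duplicated names, while "unique" disambiguates them with positional suffixes. The object names dup and fixed are made up for this example.

```r
# "minimal" performs no repair, so the duplicated name "a" is kept.
dup <- tibble::tibble(a = 1:3, b = 4:6, a = 7:9, .name_repair = "minimal")
names(dup)
#> [1] "a" "b" "a"

# "unique" makes the names unique by appending positional suffixes.
# suppressMessages() hides the "New names" message that tibble emits.
fixed <- suppressMessages(
  tibble::tibble(a = 1:3, b = 4:6, a = 7:9, .name_repair = "unique")
)
names(fixed)
#> [1] "a...1" "b"     "a...3"
```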

For a data.frame, non-unique names are only possible when the check.names argument is set to FALSE:

data.frame(a = 1:3, b = 4:6, a = 7:9, check.names = FALSE)
#>   a b a
#> 1 1 4 7
#> 2 2 5 8
#> 3 3 6 9

But when we use this data.frame with regular {dplyr} verbs, an error will be thrown, telling us that we cannot transform data frames with duplicate names:

data.frame(a = 1:3, b = 4:6, a = 7:9, check.names = FALSE) %>% 
  mutate(c = 1:3)
#> Error: Can't transform a data frame with duplicate names.

So from this we can assume that it is not recommended to use data.frames with duplicate names in the {tidyverse}. It probably contradicts the notion of tidy data.

That said, here is the approach described above:

library(tidyverse)

# Data
data <- mtcars[, 8:11]

# custom eval_select function
my_eval_select <- function(expr, data,
                           env = rlang::caller_env(),
                           ..., include = NULL, 
                           exclude = NULL, strict = TRUE,
                           name_spec = NULL,
                           uniquely_named = NULL, # this is the new argument
                           allow_rename = TRUE) {
  ellipsis::check_dots_empty()
  tidyselect:::eval_select_impl(data, names(data), rlang::as_quosure(expr, env), 
                   include = include, exclude = exclude, strict = strict, 
                   name_spec = name_spec, allow_rename = allow_rename,
                   uniquely_named = uniquely_named) # which we also add here
}

# example 1
data %>%
  my_eval_select(rlang::expr(c(foo = 1, bar = c(2:4), foobar = c(1, "am", "gear", "carb"))),
                          data = .,
                          name_spec = "{outer}",  # we need to specify this
                          uniquely_named = FALSE) # and this
#>    foo    bar    bar    bar foobar foobar foobar foobar 
#>      1      2      3      4      1      2      3      4

# example: custom function
test <- function(.data, ...) {
  loc <- my_eval_select(rlang::expr(c(...)),
                        data = .data,
                        name_spec = "{outer}",
                        uniquely_named = FALSE)
  names <- names(.data)
  list(names(loc), names[loc])
}

# test
data %>%
  test(foo = 1, bar = c(2:4), foobar = c(1, "am", "gear", "carb"))
#> [[1]]
#> [1] "foo"    "bar"    "bar"    "bar"    "foobar" "foobar" "foobar" "foobar"
#> 
#> [[2]]
#> [1] "vs"   "am"   "gear" "carb" "vs"   "am"   "gear" "carb"

Created on 2021-05-22 by the reprex package (v0.3.0)
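
If relying on an internal function via ::: is a concern, a possible alternative is to evaluate each named argument on its own with the exported tidyselect::eval_select and then repeat the outer name once per selected column. This is only a sketch under stated assumptions, not the approach used above: the function name select_outer_names is hypothetical, every argument is assumed to be named, and because each argument is evaluated separately, selections that depend on each other across arguments (such as exclusions) would not behave as they do inside a single c(...).

```r
# Hypothetical helper: returns the outer (lhs) names repeated per selected
# column, plus the matching column names. Uses exported functions only.
select_outer_names <- function(.data, ...) {
  # Capture each argument as its own quosure, keeping the lhs names.
  quos <- rlang::enquos(...)

  # Evaluate every selection separately and drop the names that
  # eval_select attaches, keeping only the integer positions.
  locs <- lapply(quos, function(q) {
    unname(tidyselect::eval_select(q, .data))
  })

  # Repeat each outer name once per position it selected.
  outer_names <- rep(names(locs), lengths(locs))
  loc <- unlist(locs, use.names = FALSE)

  list(outer_names, names(.data)[loc])
}

data <- mtcars[, 8:11]
res <- select_outer_names(
  data,
  foo = 1, bar = c(2:4), foobar = c(1, "am", "gear", "carb")
)
res
```

With the example data this should give the same two vectors as the desired output in the question.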