r dataframe dplyr data-analysis

How to select columns with UTF-8 characters in the name in R?


I have a data frame db with a column named "Coś ktoś_był". I tried this:

  temp_df <- db %>%
      select('Coś ktoś_był') 

but the output was:

Error: Can't subset columns that don't exist.
x Column `Cos ktos_byl` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.

I don't know how to make this work without changing the column name, and I can't change it!


Solution

  • Try this:

    library(dplyr)
    temp_df <- db %>%
      select(matches("[^ -~]"))
    

    Alternatively, in base R:

    db[ , grepl("[^ -~]", names(db))]
    

    Both methods select every column whose name contains at least one character outside the printable ASCII range (space through tilde), that is, any non-ASCII character. If you need to be more specific, you can anchor the pattern, along these lines:

    temp_df <- db %>%
      select(matches("^Co[^ -~]"))
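
To check the patterns above on a small example, here is a minimal sketch. The data frame is a hypothetical stand-in for db, and the names target, by_regex, by_name and by_base are made up for illustration. It also shows one extra option that is not part of the answer above: writing the exact column name with \u escapes and passing it through all_of(). This assumes the name really is stored as UTF-8 in your data, and it may help when the error comes from the script's encoding turning "ś" into "s".

```r
library(dplyr)

# Column name written with \u escapes, so it does not depend on the
# encoding of the script file. It spells "Coś ktoś_był".
target <- "Co\u015b kto\u015b_by\u0142"

# Hypothetical stand-in for db: one non-ASCII column name, two ASCII ones.
db <- data.frame(1:3, 4:6, 7:9)
names(db) <- c(target, "plain", "id")

# Regex approach: keep columns whose name contains a character
# outside the range space..tilde (printable ASCII).
by_regex <- db %>% select(matches("[^ -~]"))

# Exact-name approach (an assumption, not from the answer above):
# select the escaped string via all_of().
by_name <- db %>% select(all_of(target))

# Base R version of the regex approach; drop = FALSE keeps a data frame
# even when only one column matches.
by_base <- db[, grepl("[^ -~]", names(db)), drop = FALSE]
```

The regex version picks up every non-ASCII column, so the all_of() version is the safer choice when several column names contain such characters and you only want one of them.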