Search code examples
sqlrodbc

R: Renaming Multiple Variables


I am working with the R programming language. I am trying to rename all the variables in a data frame by adding "_a" to the end of each variable. I figured out how to do this with the "dplyr" library (for data frames in the global environment):

library(dplyr)
library(dbplyr)

var1 = rnorm(100,10,10)
var2 = rnorm(100,10,10)
var3 = rnorm(100,10,10)

my_data = data.frame(var1,var2,var3)

df = my_data %>% rename_all(paste0, "_a")

Problem: My actual data frame is on a database which I access using "RODBC SQL" commands, for example:

library(RODBC)
library(sqldf)

con = odbcConnect("some name", uid = "some id", pwd = "abc")

sample_query = sqlQuery(con, "select distinct * from df")

What I tried so far: Using the "dbplyr" library, I "extract" the SQL code performed above:

 con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

df <- copy_to(con, my_data)

final =  df %>% 
 rename_all(paste0, "_a")  %>% show_query()

#results

<SQL>
SELECT `var1` AS `var1_a`, `var2` AS `var2_a`, `var3` AS `var3_a`
FROM `my_data`

My Question: The actual table I am using ("my_data") contains many columns and is on a database. I would like to rename all the columns (e.g. add "_a" at the end of each column), but I can not do this until I figure how to pass the SQL statement through sqlQuery() . This means that I am unable to get the entire list of columns all at once and rename them all at once.

I could manually do this, e.g.

sample_query = sqlQuery(con, "SELECT `var1` AS `var1_a`, `var2` AS `var2_a`, `var3` AS `var3_a` etc etc etc `varfinal` AS `varfinal_a`
    FROM `my_data` ")

But I am looking for a way to do this automatically.

Is it possible to write something like this ?

#not sure if this is correct
 sample_query = sqlQuery(con, "rename_all(paste0, "_a")") 

Thanks!


Solution

  • There may be better alternatives, but you could build a query string as follows:

    query <- paste(
      "SELECT",
      paste(
        names(my_data),
        "AS",
        paste0(names(my_data), "_a"),
        collapse = ", "
      ),
      "FROM my_data"
    )
    

    Then query will give you:

    [1] "SELECT var1 AS var1_a, var2 AS var2_a, var3 AS var3_a FROM my_data"
      
    

    Which can be used in your sqlQuery statement:

    sample_query = sqlQuery(con, query)