Search code examples
sqlrodbcdbidbplyr

How to Connect to SQL from R Studio


I use Microsoft SQL Server Management Studio on Windows 10 to connect to the following database and this is what the login screen looks like:

Server Type:      Database Engine
Server Name:      sqlmiprod.b298745190e.database.windows.net
Authentication:   SQL Server Authentication
Login:            my_user_id
Password:         my_password

This recent R Studio article offers an easy way to connect to SQL Servers from R Studio using the following:

con <- DBI::dbConnect(odbc::odbc(),
                      Driver   = "[your driver's name]",
                      Server   = "[your server's path]",
                      Database = "[your database's name]",
                      UID      = rstudioapi::askForPassword("Database user"),
                      PWD      = rstudioapi::askForPassword("Database password"),
                      Port     = 1433)

I have two questions

  1. What should I use as "[your driver's name]"?
  2. What should I use as "[your database's name]"?

The server path I'll use is sqlmiprod.b298745190e.database.windows.net (from above) and I'll leave the port at 1433. If that's wrong please let me know.


Solution

  • Driver

    From @Zaynul's comment and my own experience, the driver field is a text string with the name of the ODBC driver. This answer contains more details on this.

    You probably want someting like:

    • Driver = 'ODBC Driver 17 for SQL Server' (from @Zaynul's comment)
    • Driver = 'ODBC Driver 11 for SQL Server' (from my own context)

    Database

    The default database you want to connect to. Roughly equivalent to starting an SQL script with

    USE my_database
    GO
    

    If all your work will be within a single database then puts its name here.

    In some contexts you should be able to leave this blank, but you then have to use the in_schema command to add the database name every time you connect to a table.

    If you are working across multiple databases, I recommend putting the name of one database in, and then using the in_schema command to specify the database at every point of connection.

    Example using the in_schema command (more details):

    df = tbl(con, from = in_schema('database.schema', 'table'))
    

    Though I have not tried it, if you do not have a schema then

    df = tbl(con, from = in_schema('database', 'table'))
    

    Should also work (I've been using this hack without issue for a while).