Search code examples
rlinuxsshrmysql

SSH within R script to get to MySQL Database


I am trying to connect to a MySQL server, which is restricted by being connected to a given server. I am trying to connect through this restricting server while not physically connected.

Through the command line this is doable by creating a SSH connection, after which I can run MySQL commands from the command line. For example:

ssh myUsername@Hostname
myUsername@Hostname's password:
[myUsername@Host ~]$ mysql -h mySQLHost -u mySQLUsername -p mySQLPassword

However, I wish to connect to the MySQL database from within R, so I can send queries to read in tables into my current R session. Usually I would run a R session inside of the commandline, but the server does not have R installed on it.

For example, I have this snippet of code that work when I am physically connected to the server (filled in information changed):

myDB <- dbConnect(MySQL(), user="mySQLUsername", password="mySQLPassword", dbname="myDbname", host="mySQLHost")

In essence, I want to run this same command through a pipe, so that the myDB object is a working mySQL connection.

I have been trying to pipe my way into the restricting server from within R, and have been able to read in a csv file. For example:

dat <- read.table(pipe('ssh myUsername@Hostname "cat /path/to/your/file"'))

This prompts me for my password, and the table is read (as is suggested it would here). However, I am unsure how to translate this to a MySQL connection. For example, should I make the pipe part of the host argument? That was my first thought, but have been unable to make that work.

Any help would be appreciated.


Solution

  • I accomplish a similar task with Postgres using SSH tunneling. Effectively, what you're doing with an SSH tunnel is saying "establish a connection to the remote server, and make a port from that server available as a port on my local machine."

    You can set up a SSH tunnel using the following command on your local machine:

    ssh -L local_port:lochalhost:remote_port username@remote_host
    

    Specifically, what you're doing with this command is creating a Local Port Forwarding SSH tunnel, which is taking the port you'd connect to directly on the machine with your database installed (remote_port), and securely sending it to the machine you have R installed on as local_port.

    For example, for a database server with the following options:

    hostname: 192.168.1.3
    username: mysql
    server mysql port: 3306
    

    You could use the following command (at the command line, or in R using system2) to create a tunnel to port 9000 on your machine:

    ssh -L 9000:localhost:3306 mysql@192.168.1.3
    

    Depending on what your exact DBI connection looks like in R, you may have to edit the connection configuration slightly to make it connect to your newly created tunneling port. The reason why I use a different localhost port is that it prevents conflicts with a local version of the database, if you've got one.