
Read remote file beginning with "smb://" using R


To read a file in R, I'd normally do something like the following:

read.csv('/Users/myusername/myfilename.csv')

But I'm trying to read a file located on a remote server (a Windows SMB/CIFS share), which I access on my Mac via Finder's Go > Connect to Server menu item.

When I view that file's properties, the file path is different from what I'm used to. Instead of beginning with /Users/myusername/..., it is smb://server.msu.edu/.../myfilename.csv.

To read the file, I tried the following:

read.csv('smb://server.msu.edu/.../myfilename.csv')

But, this didn't work.

Instead of the usual "No such file or directory" error, this returned:

smb://server.msu.edu/.../myfilename.csv does not exist in current working directory

I imagine the file path needs a different format, but I can't figure out what.

How can you read this type of file in R?


Solution

  • Explanation

    smb://educ-srvmedia1.campusad.msu.edu/... is actually a URL, not a file path.

    Let's break this down:

    smb:// means use the Server Message Block (SMB) protocol (file sharing)

    educ-srvmedia1.campusad.msu.edu is the name of the server

    /.../myfilename.csv is the share name and file path on the remote server
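
    The same breakdown can be sketched in R. This is only an illustration: split_url() is a hypothetical helper (not part of base R), and the host and share names in the example are made up.

    ```r
    # Split a URL-like string into scheme, host and path using base R only.
    # split_url() is a hypothetical helper written for this illustration.
    split_url <- function(u) {
      pattern <- "^([A-Za-z][A-Za-z0-9+.-]*)://([^/]+)(/.*)?$"
      m <- regmatches(u, regexec(pattern, u))[[1]]
      if (length(m) == 0) stop("not a URL: ", u)
      # m[1] is the full match; m[2], m[3], m[4] are the capture groups
      list(scheme = m[2], host = m[3], path = m[4])
    }

    # Made-up host and share, used only to show the three parts
    parts <- split_url("smb://server.example.edu/sharename/myfilename.csv")
    print(parts$scheme)  # the protocol
    print(parts$host)    # the server name
    print(parts$path)    # the share and file path on that server
    ```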

    You are able to navigate to this directory using Finder on OS X because it has built-in support for the SMB protocol. Finder connects to the remote service using the URL and allows you to browse the files.

    However, R has no understanding of the SMB protocol, so it can't interpret the file path properly.

    The R function read.csv() uses file() internally, see https://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html

    url() and file() support the URL schemes file://, http://, https:// and ftp://

    Because smb:// is not one of these, R appears to treat the whole string as an ordinary file name, fails to find such a file, and reports that the file does not exist. Yes, this is slightly confusing.
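
    One way to get a clearer signal before calling read.csv() is to check the scheme yourself. The sketch below assumes the scheme list quoted above; can_read_directly() and its two helpers are hypothetical names, not R functions, and the example paths are made up.

    ```r
    # TRUE if the string starts with some "scheme://" prefix
    has_scheme <- function(x) grepl("^[A-Za-z][A-Za-z0-9+.-]*://", x)

    # TRUE if the prefix is one of the schemes listed above
    is_supported_url <- function(x) {
      grepl("^(file|https?|ftp)://", x, ignore.case = TRUE)
    }

    # A plain path (no scheme) or a supported URL can be handed to read.csv();
    # anything else, such as smb://, needs to be mounted first.
    can_read_directly <- function(x) !has_scheme(x) | is_supported_url(x)

    candidates <- c(
      "/Users/myusername/myfilename.csv",
      "https://example.org/myfilename.csv",
      "smb://hostname/sharename/myfilename.csv"
    )
    print(can_read_directly(candidates))
    ```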

    Fix

    You need to mount the file share on your local filesystem.

    All this means is that the details of the SMB protocol will be handled behind the scenes by the OS, and the file share will be presented as a local directory.

    This will allow R (and other programs) to treat the remote files, for all intents and purposes, like any other local files. This discussion shows some options for doing so.

    e.g.

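    # need to create /LocalFolder first; on a Mac the filesystem type is smbfs
    mount -t smbfs //username:password@hostname/sharename /LocalFolder
    # on Linux (cifs-utils): mount -t cifs //hostname/sharename /LocalFolder -o username=username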
    

    then in R:

    read.csv('/LocalFolder/myfilename.csv')
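
    If you want to keep using the smb:// address as your reference, a small helper can translate it into the mounted path. This is a sketch under stated assumptions: smb_to_local() is a hypothetical function, the first path component of the URL is taken to be the share name, and the share is assumed to be mounted at the directory you pass in (/LocalFolder above; shares connected through Finder typically appear under /Volumes/sharename instead).

    ```r
    # Map an smb:// URL to a path under a local mount point.
    # Assumes the share itself (first path component) is what is mounted.
    smb_to_local <- function(smb_url, mount_point) {
      rest  <- sub("^smb://[^/]+/", "", smb_url)       # drop scheme and host
      parts <- strsplit(rest, "/", fixed = TRUE)[[1]]  # share, then dirs/file
      do.call(file.path, as.list(c(mount_point, parts[-1])))  # drop the share
    }

    # hostname, sharename and data/ are placeholders, not real locations
    local_path <- smb_to_local(
      "smb://hostname/sharename/data/myfilename.csv",
      "/LocalFolder"
    )
    print(local_path)

    # Only attempt the read if the share is actually mounted
    if (file.exists(local_path)) {
      df <- read.csv(local_path)
    } else {
      message("Not found; is the share mounted at /LocalFolder?")
    }
    ```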
    

    Extra

    Windows users can accomplish this more easily with UNC paths:
    How to read files from a UNC-specified directory in R?