Search code examples
r

How to get the absolute path from the context of the current running R script?


Using python, if I need the absolute path from the context of the current running script all I need to do is to add the following in the code of that script:

import os
os.path.abspath(__file__)

This is very useful as having the absolute path I can then use os.path.join to form new absolute paths for my project components (inside the project directory tree) and more interesting is that everything will continue to work without any problem no matter where the package directory is moved.

I need to achieve the very same thing using R programming, that is obtaining the absolute path of the current running R script ( = the absolute path of its file on the disk). But trying to do the same in R turns out to be quite challenging, at least for me as a rather beginner in R.

After a lot of googling, I tried to use the reticulate package to call Python from R but __file__ is not available there, then I found a few threads on Stackoverflow suggesting to play with the running Stack and others suggesting the use of normalizePath. However none of these worked for me when the entire project package is transferred from one directory to another.

Therefore, I would like to know if for example you have the following file/directory tree

base_dir ( = /home/usr1/apps/R/base_dir)
|
|
|___ myscript.R (this is my R script to be run)
|___ data (this is a directory)
|___ sql  (this is a directory)

Is there any solution allowing to add something in the code of myscript.R so that inside the script the program can always know that the base directory is /home/usr1/apps/R/base_dir and if later this base directory is moved to another directory then there is no need to change the code and the program would be able to find correctly the new base directory?


Solution

  • R has in general no way of finding this path, because there is no equivalent to Python’s __file__ in R.

    The closest you can get is to look at commandArgs() and laboriously extract the script filename (which requires different handling depending on how the script was launched!). But this will fail if the script was executed in RStudio, and it will fail after calling setwd().

    Other solutions (such as the ‘here’ package) rely on heuristics and specific project structures.

    But luckily there’s actually a solution that will always work: use ‘box’ modules.

    With modules, you’ll always be able to get the path of the current script/module via box::file(). This is the closest equivalent to Python’s __file__ you’ll get in R, and it always works — as long as you’re using ‘box’ modules consistently.

    (Internally the ‘box’ package requires complex logic to determine the value of the file() function in all circumstances; I don’t recommend replicating it, it’s too complex. For the curious, the bulk of the relevant logic is in R/loaded.r.)