I am trying to use R's built-in progress bar (txtProgressBar) with the %%R magic in Jupyter. It produces a nice animation when executed in the R console or RStudio, but in Jupyter (notebook or lab) with the rpy2 extension it prints all the steps at once after the code finishes, which makes the progress bar useless. Two questions:
Here is a simple snippet of a progress bar from rfunction.com:
%%R
SEQ <- seq(1,100)
pb <- txtProgressBar(1, 100, style=3)
TIME <- Sys.time()
for(i in SEQ){
  Sys.sleep(0.02)
  setTxtProgressBar(pb, i)
}
For the folks new to rpy2: it needs to be installed with pip install rpy2, and the magic needs to be loaded in Jupyter with %load_ext rpy2.ipython.
Edit: The workaround I use for now is to manually invoke the code via robjects.r:
from rpy2.robjects import r
r("""
SEQ <- seq(1,100)
pb <- txtProgressBar(1, 100, style=3)
TIME <- Sys.time()
for(i in SEQ){
  Sys.sleep(0.02)
  setTxtProgressBar(pb, i)
}
""")
However, this is not ideal: I would prefer to keep all the benefits of rpy2's R magic.
There should be a way to achieve this, as the R magic is calling robjects.r() (as you are in your workaround).
In short, the following happens when you submit an %%R Jupyter cell for evaluation:
1. The arguments on the %%R line are evaluated, and any setup needed before the R code runs is done (e.g., using a local converter, converting input parameters, etc.).
2. The code in the %%R cell is evaluated in the R "Global Environment" as a string of code.
The second step is essentially a call to the R C API, and because of the GIL it is the only activity happening in that process. However, rpy2 defines default callbacks that reroute R's printing to the terminal/console to Python's own print(), which is why you see the output while the code is running in your call to robjects.r().
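To make the callback idea concrete, here is a minimal pure-Python sketch. It does not use rpy2 or R: run_progress is a hypothetical stand-in for R evaluating the loop, and make_streaming_writer is a hypothetical stand-in for a console-write callback that forwards each chunk immediately. It only illustrates why output routed through such a callback shows up while the code is still running.

```python
import io


def run_progress(n_steps, console_write):
    """Stand-in for R running a loop: hands one chunk of text per step to a callback."""
    for i in range(1, n_steps + 1):
        console_write(f"\r{i}/{n_steps}")


def make_streaming_writer(stream):
    """Build a callback that forwards each chunk to `stream` as soon as it arrives."""
    def write(chunk):
        stream.write(chunk)
        stream.flush()
    return write


# Record what the stream holds after every step, to show that output
# becomes visible incrementally rather than only at the end.
out = io.StringIO()
streaming_write = make_streaming_writer(out)
snapshots = []


def recording_writer(chunk):
    streaming_write(chunk)
    snapshots.append(out.getvalue())


run_progress(3, recording_writer)
```

After each step the stream already contains that step's text, which is the behaviour a progress bar depends on.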
I am seeing that the R magic is caching the R output, and while there is an attribute cache_display_data that should control this, it is not used. This is a bug, both for the reason you are asking about on Stack Overflow and because an R code block that prints a lot would use more memory than needed (and could even exhaust all RAM). I do not know whether it has always been present or was introduced during code refactoring; it is now tracked here: https://bitbucket.org/rpy2/rpy2/issues/543
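The difference between cached and uncached output can be sketched as follows. This is a simulation, not rpy2's actual implementation: OutputCollector, publish and emit_steps are hypothetical names, and only the flag name cache_display_data is borrowed from the attribute mentioned above.

```python
import io


class OutputCollector:
    """Hypothetical collector; only the flag name mirrors rpy2's attribute."""

    def __init__(self, stream, cache_display_data=False):
        self.stream = stream
        self.cache_display_data = cache_display_data
        self._cache = []

    def write(self, chunk):
        if self.cache_display_data:
            # Cached: nothing is visible yet, and memory grows with the output.
            self._cache.append(chunk)
        else:
            # Not cached: each chunk is visible as soon as it is produced.
            self.stream.write(chunk)

    def publish(self):
        """Write out whatever was cached, once evaluation is over."""
        self.stream.write("".join(self._cache))
        self._cache.clear()


def emit_steps(collector, n_steps):
    """Emit one chunk per step and record what is visible after each step."""
    seen = []
    for i in range(1, n_steps + 1):
        collector.write(f"{i} ")
        seen.append(collector.stream.getvalue())
    collector.publish()
    return seen


cached = OutputCollector(io.StringIO(), cache_display_data=True)
streamed = OutputCollector(io.StringIO(), cache_display_data=False)
cached_seen = emit_steps(cached, 3)
streamed_seen = emit_steps(streamed, 3)
```

Both collectors end with the same text, but the cached one shows nothing until publish() runs, which matches the symptom of all progress steps appearing at once.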
Edit: The fix is now in the repository, and will be part of rpy2-3.0.3 (likely released today).