
How can I capture CLI tool file output to R object or stdout?


I'm calling a bunch of command-line interface (CLI) tools (such as texi2pdf or pdf2svg) from an R script, and I'd like to capture the output file of these tools directly as an R object, without touching the file system.

This is the opposite of the more frequent "how do I redirect stdout to a file" question. (Perhaps that implies that I'm "using it wrong".)

Example:

Say I have a simple LaTeX file, reprex.tex, that I'd like to compile:

\documentclass{article}

\begin{document}

Foo.


\end{document}

In R, I can use the tools::texi2pdf() wrapper, or I can send the command myself to compile this to a pdf.

On the shell, simply:

texi2pdf reprex.tex

Equivalently, called from R:

reprex_as_pdf <- system2(command = "texi2pdf", args = c("reprex.tex"), stdout = TRUE, stderr = TRUE)

Conveniently, system2() allows me to capture stdout/stderr as a character vector via stdout = TRUE, which gets me half of the way.

However, I can't seem to find anything in the texi2pdf manpage that would allow me to redirect the (binary!) output to stdout (and that way, to system2()).

How can I capture the output of texi2pdf directly in R as a (raw) vector?

(Bonus: how can I also pass the input to texi2pdf as an R character vector, rather than as a file?)


Work-Around

I can of course work via a tempfile(), but that would touch the filesystem, and just seems inelegant / cumbersome.

library(readr)
system2(command = "texi2pdf",
        args = c("reprex.tex"),
        stdout = TRUE,
        stderr = TRUE)
reprex_as_pdf <- read_file_raw("reprex.pdf")
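
The work-around can at least be confined to a single function, so that nothing outside a temporary directory is touched. A minimal sketch in base R follows. `compile_tex_to_raw` is a hypothetical helper (not part of tools or readr), and it assumes the command writes reprex.pdf next to its input, as the call above does.

```r
# Hypothetical helper: run a TeX-to-PDF command inside a throwaway directory
# and return the resulting PDF as a raw vector.
compile_tex_to_raw <- function(tex_lines, command = "texi2pdf") {
  build_dir <- tempfile("texbuild_")
  dir.create(build_dir)
  old_wd <- setwd(build_dir)
  # On exit: first restore the working directory, then delete the build
  # directory together with any .aux/.log by-products.
  on.exit(setwd(old_wd), add = TRUE)
  on.exit(unlink(build_dir, recursive = TRUE), add = TRUE)

  writeLines(tex_lines, "reprex.tex")
  log <- system2(command, args = "reprex.tex", stdout = TRUE, stderr = TRUE)

  # system2() attaches a "status" attribute only when the exit code is non-zero
  status <- attr(log, "status")
  if (!is.null(status) && status != 0) {
    stop("command failed with status ", status, ":\n",
         paste(log, collapse = "\n"))
  }
  if (!file.exists("reprex.pdf")) {
    stop("command did not produce reprex.pdf")
  }
  readBin("reprex.pdf", what = "raw", n = file.size("reprex.pdf"))
}

# Usage, guarded so it only runs where texi2pdf is installed
if (nzchar(Sys.which("texi2pdf"))) {
  reprex_as_pdf <- compile_tex_to_raw(c(
    "\\documentclass{article}",
    "\\begin{document}",
    "Foo.",
    "\\end{document}"
  ))
  length(reprex_as_pdf)
}
```

This still touches the file system, but the side effects live and die inside one function call, and the input is passed as a character vector.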

Why would anyone want to do this, you ask?

I'm generally scared of side effects and cross-machine/OS file system shenanigans, and want to isolate the side effects to very few functions. Also, the PDF will be programmatically exported and converted by all sorts of different functions. Lastly, I need a lot of these PDFs, and I want them compiled and cached before deploying to a server which may not have texi2dvi.

Please stop me if I am just absolutely "using it wrong".


Solution

  • Put simply: in the general case you can’t. But sometimes these tools allow you to specify an output file, in which case you can (on some systems, but note that this is not portable) specify /dev/stdout as the output file.

    According to the texi2pdf manpage, the following should therefore work:

    reprex_as_pdf <- system2(
        command = "texi2pdf",
        args = c("reprex.tex", "-o", "/dev/stdout"),
        stdout = TRUE, stderr = TRUE
    )
    

    However, this doesn’t prevent the tool from touching the file system in other ways (creating temporary files, etc.). There’s no way to prevent this, nor would it be desirable: these effects should be transparent to the user. The exception is the generation of multiple output files (e.g. log files), which TeX-related tools unfortunately do prolifically.

    To answer your bonus question: this is once again not possible in a platform-independent way, but on POSIX systems you can create a named pipe. However, for all intents and purposes, from an R perspective this behaves like an ordinary file, and it’s managed by the file system.
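
One caveat on the system2() call above: stdout = TRUE returns a character vector split into lines, which is not a faithful container for binary data (R strings cannot hold embedded nul bytes, for one), and stderr = TRUE merges diagnostics into the same stream. For a tool that really does write binary output to stdout, reading from a pipe() connection opened in binary mode may be a better fit. The following is only a sketch, assuming a POSIX-like shell. `read_command_raw` is a hypothetical helper name, and whether texi2pdf cooperates with /dev/stdout is untested here.

```r
# Hypothetical helper: run a shell command line and return everything it
# writes to stdout as a raw vector, read in binary mode.
read_command_raw <- function(command_line, chunk_size = 65536L) {
  con <- pipe(command_line, open = "rb")
  on.exit(close(con), add = TRUE)

  chunks <- list()
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break  # zero bytes read means end of stream
    chunks[[length(chunks) + 1L]] <- chunk
  }
  if (length(chunks) == 0L) raw(0) else do.call(c, chunks)
}

# Demonstration with a command that emits a nul byte (assumes a POSIX printf)
demo <- read_command_raw("printf 'a\\000b'")
print(demo)

# Hypothetical call for the example from the question (left commented out):
# reprex_as_pdf <- read_command_raw("texi2pdf reprex.tex -o /dev/stdout")
```

Note that anything else the tool prints to stdout, such as progress messages, would end up interleaved with the PDF bytes, so a quiet mode (where the tool offers one) may be needed.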