Tags: python, c++, shell, lsf

How to stream the interactive shell of a remote program to stdout of a running c++ program that launched the remote program (using BSUB -I)


I have a C++ program (say, process P1) which, in the course of its execution, spawns a new process P2 on a remote machine using a launcher like LSF. P2 has an interactive shell (it could be Python). I want the user at P1 to use P2's interactive shell for a while and then exit P2 when done. P1 then continues from there and may spawn other interactive shells in the future. In the meantime, P1 either continues in the background or is blocked (it does not matter at the moment). P2 must be spawned by a local program like P1, because P1 may spawn other processes depending on certain conditions, and P1 could relaunch P2 if P2 crashes. All the processes run on Linux.
Launching bsub -Ip P2 using popen does not stream the shell of P2 to P1's stdout; it only shows that the program was started on a certain machine.
If streaming is not possible, is there an alternative way to handle such a scenario?

On a Linux shell, to get an interactive Python shell launched on a remote machine with bsub, I use the following:

$ bsub -Ip python
Job <625381> is submitted to default queue <interq>.
<<Waiting for dispatch ...>>
<<Starting on 1i-10-144>>
Python 3.12.0 (main, Nov 26 2023, 21:52:55) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> print("Hi")
Hi
>>> 

I want to get a similar interface from my C++ program. Here is a simple program to show what I want to do.

//  Program for P1
#include <cstdio>    // popen, pclose, fgets, perror
#include <iostream>
#include <string>

int main()
{
   /* do P1 work */

   char buffer[128];
   FILE* pipe = popen("bsub -Ip python", "r");   // P2 is python here.
   if (!pipe) {
      perror("popen");
      return 1;
   }
   std::string result;
   // Read until EOF; the output is only collected in result, never printed.
   while (fgets(buffer, sizeof buffer, pipe) != NULL) {
      result += buffer;
   }
   pclose(pipe);

   /* continue P1 work */
   return 0;
}

When I run this program, I do not see the Python shell on stdout. This is expected, as I have not done anything to redirect it.

$ ./a.out 
<<Waiting for dispatch ...>>
<<Starting on 1i-38-204>>

Since bsub has launched the program in interactive mode on the remote machine, the output must already be streamed to my machine. But how do I access this stream from my program and redirect it to the stdout of P1?


Solution

  • Many programs check whether stdout is attached to a tty and behave differently depending on the result. With popen(..., "r"), the child's stdout is a pipe rather than your terminal, so the program may disable interactive mode, or its output may be block-buffered (nothing appears until a certain number of bytes have been written). Note also that Python specifically sends parts of its interactive prompt to stderr rather than stdout, so you may not be able to capture it through that pipe at all.
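To make the tty check concrete, here is a minimal sketch of the test many programs perform at startup, using the POSIX isatty call. The helper names describe_fd and describe_pipe_end are hypothetical and exist only for illustration; the second one inspects the write end of a fresh pipe, which is roughly what a child started through popen(..., "r") sees as its stdout.

```cpp
#include <unistd.h>   // isatty, pipe, close
#include <string>

// Hypothetical helper: classifies a file descriptor the way an
// interactive program typically would before choosing its mode.
std::string describe_fd(int fd) {
    return isatty(fd) ? "terminal" : "not a terminal";
}

// Hypothetical helper: creates a pipe and classifies its write end,
// which stands in for the stdout a popen(..., "r") child receives.
std::string describe_pipe_end() {
    int fds[2];
    if (pipe(fds) == -1) {
        return "error";
    }
    std::string kind = describe_fd(fds[1]);
    close(fds[0]);
    close(fds[1]);
    return kind;
}
```

Calling describe_fd(STDOUT_FILENO) from a program started in a terminal should report "terminal", while the pipe case reports "not a terminal", which is the condition under which a program may switch off its interactive behaviour.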

    For simple cases, you can use the system library call. This will call the target program using the shell (a la /bin/sh -c "system argument"), pass its stdin, stdout and stderr through to your terminal, and wait for the process to exit, returning its exit code. In your example, this would be system("bsub -Ip python");

    Note that system does come with some caveats:

    • Never pass untrusted user input to the argument of system (or popen), as the input is directly used as a shell command, and any failure to carefully escape the input can result in arbitrary code execution and compromise.
    • Never run programs in an untrusted environment; both system and popen implicitly use $PATH to find programs (as they hand the input string off to the shell), and an untrusted PATH variable could result in compromise.
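As a rough sketch of the system approach, the wrapper below runs a fixed, trusted command line and decodes the wait status that system returns. The name run_via_shell and the "128 + signal number" convention are choices made for this illustration, not something system itself provides.

```cpp
#include <cstdlib>     // std::system
#include <sys/wait.h>  // WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG

// Hypothetical helper: runs a trusted command line via /bin/sh -c.
// The command inherits this process's stdin, stdout and stderr.
// Returns the exit code, 128 + signal number if the shell was killed
// by a signal, or -1 if the command could not be started at all.
int run_via_shell(const char* command) {
    int status = std::system(command);
    if (status == -1) {
        return -1;  // fork or wait failed inside system()
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    if (WIFSIGNALED(status)) {
        return 128 + WTERMSIG(status);
    }
    return -1;
}
```

With this in place, P1 could call run_via_shell("bsub -Ip python") and continue its own work once the call returns. The value it gets back is whatever exit code bsub itself reports through the shell.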

    For more complex cases, on Linux the usual answer is to fork/exec/wait. You call fork to create a subprocess, then the subprocess does any necessary configuration (e.g. redirecting stdin/stdout/stderr, configuring signals, etc.) before calling one of the exec* functions (e.g. execve to control both the command line and environment). The parent process uses wait to wait for the subprocess to exit.

    This has the advantage of being the most flexible way to run a program, as it is essentially what the other functions (popen, system, etc.) do under the hood, and it lets you fully control all the details - for example, whether to actually wait for the subprocess or not, whether to redirect stdout/stderr/stdin, what the process environment should be, etc.

    Here's a simple example that just passes the environment from the parent process:

    #include <stdio.h>      // perror
    #include <stdlib.h>
    #include <sys/types.h>  // pid_t
    #include <sys/wait.h>   // waitpid
    #include <unistd.h>     // fork, execlp, _exit

    int run() {
        pid_t pid = fork();
        if (pid == -1) {
            perror("fork");
            return -1;
        } else if (pid > 0) {
            /* parent - wait for child process to exit */
            int status;
            if (waitpid(pid, &status, 0) == -1) {
                perror("waitpid");
                return -1;
            }
            // status is the raw wait status; check it with WIFEXITED,
            // WEXITSTATUS, etc. to see if child exited cleanly
            return status;
        } else {
            /* child */
            // can now use dup2() to redirect stdin/stdout etc.
            // note that execlp uses PATH to search for the binary;
            // for safety you probably want execl() with the full path to your program
            // first argument is program to run; subsequent arguments are argv[]
            // the argument list must end with a null pointer of type char *
            execlp("bsub", "bsub", "-Ip", "python", (char *)NULL);
            // exec should not return; if it does, that means it could not run the program
            perror("execlp");
            _exit(1);
        }
    }
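Since the question mentions relaunching P2 after a crash, here is one possible sketch of how the wait status could drive that decision. ChildResult, run_child and run_with_relaunch are hypothetical names introduced for this example. Treating "killed by a signal" as the crash condition is an assumption: how a crash of the remote job surfaces in bsub's own exit status should be verified against your LSF setup.

```cpp
#include <cerrno>
#include <cstdio>       // perror
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid and the W* status macros
#include <unistd.h>     // fork, execvp, _exit

// Hypothetical result type for this sketch.
struct ChildResult {
    bool started = false;   // fork and waitpid both succeeded
    bool signaled = false;  // child was terminated by a signal
    int code = -1;          // exit code, or signal number if signaled
};

// Runs argv[0] (searched in PATH) with the parent's terminal and
// environment, and waits for it to finish.
ChildResult run_child(char* const argv[]) {
    ChildResult result;
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return result;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        perror("execvp");  // only reached if exec failed
        _exit(127);
    }
    int status = 0;
    pid_t waited;
    do {  // retry if a signal handler interrupts the wait
        waited = waitpid(pid, &status, 0);
    } while (waited == -1 && errno == EINTR);
    if (waited == -1) {
        perror("waitpid");
        return result;
    }
    result.started = true;
    if (WIFEXITED(status)) {
        result.code = WEXITSTATUS(status);
    } else if (WIFSIGNALED(status)) {
        result.signaled = true;
        result.code = WTERMSIG(status);
    }
    return result;
}

// Runs the child up to max_attempts times, relaunching only when the
// previous run ended because of a signal.
ChildResult run_with_relaunch(char* const argv[], int max_attempts) {
    ChildResult result;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        result = run_child(argv);
        if (!result.started || !result.signaled) {
            break;
        }
    }
    return result;
}
```

To use it, build a null-terminated array of writable strings holding bsub, -Ip and python, and pass it to run_with_relaunch along with the maximum number of attempts. Passing an argv array instead of a single command string also avoids the shell quoting concerns mentioned for system and popen.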