
How do I delay or buffer a program's output stream for a few seconds before it is piped to another program?


I want to run a program that, after a few seconds, prints a generated URL to stderr. I then want to take this URL and pass it to my browser. I also want to leave the terminal output unchanged, so I use the tee command.

I have already solved the parsing and piping part by capturing the output in a file, but I still need to figure out how to hook it up to the program itself.

michael@DESKTOP-OI3AOU6:~$ ./anaconda3/bin/jupyter lab ~ 2> 1.txt

michael@DESKTOP-OI3AOU6:~$ cat 1.txt
[I 12:02:11.619 NotebookApp] JupyterLab extension loaded from /home/michael/anaconda3/lib/python3.7/site-packages/jupyterlab
[I 12:02:11.620 NotebookApp] JupyterLab application directory is /home/michael/anaconda3/share/jupyter/lab
[I 12:02:11.622 NotebookApp] Serving notebooks from local directory: /home/michael/anaconda3/bin
[I 12:02:11.622 NotebookApp] The Jupyter Notebook is running at:
[I 12:02:11.622 NotebookApp] http://localhost:8888/?token=e48288141f435ebe3008ba9209d2c6d4f456a664bf6aed34
[I 12:02:11.622 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 12:02:11.631 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///home/michael/.local/share/jupyter/runtime/nbserver-2069-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=e48288141f435ebe3008ba9209d2c6d4f456a664bf6aed34

Then I can pipe it through my chain like so:

michael@DESKTOP-OI3AOU6:~$ cat 1.txt > >(grep ^[[:blank:]].*http.* | tr -d " \t\n\r")
michael@DESKTOP-OI3AOU6:~$ http://localhost:8888/?token=e48288141f435ebe3008ba9209d2c6d4f456a664bf6aed34

And it works wonderfully to pipe the url into my browser with a custom profile set up:

cat 1.txt > >(grep ^[[:blank:]].*http.* | tr -d " \t\n\r" | xargs firefox.exe -P jupyterlab 2> /dev/null)

Putting it all together I get the behaviour I want with the browser launching and the error log showing with:

michael@DESKTOP-OI3AOU6:~$ cat 1.txt > >(tee >(grep ^[[:blank:]].*http.* | tr -d " \t\n\r"| xargs firefox.exe -P jupyterlab 2>/dev/null))

michael@DESKTOP-OI3AOU6:~$ [I 12:02:11.619 NotebookApp] JupyterLab extension loaded from /home/michael/anaconda3/lib/python3.7/site-packages/jupyterlab
[I 12:02:11.620 NotebookApp] JupyterLab application directory is /home/michael/anaconda3/share/jupyter/lab
[I 12:02:11.622 NotebookApp] Serving notebooks from local directory: /home/michael/anaconda3/bin
[I 12:02:11.622 NotebookApp] The Jupyter Notebook is running at:
[I 12:02:11.622 NotebookApp] http://localhost:8888/?token=e48288141f435ebe3008ba9209d2c6d4f456a664bf6aed34
[I 12:02:11.622 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 12:02:11.631 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///home/michael/.local/share/jupyter/runtime/nbserver-2069-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=e48288141f435ebe3008ba9209d2c6d4f456a664bf6aed34

The problem occurs when I attach my program to the pipe. It takes a few seconds to initialize the server and produce the output. The tr command ends up pushing an empty string forward to the browser before the correct line is received by grep.

michael@DESKTOP-OI3AOU6:~$ ./anaconda3/bin/jupyter lab ~ 2> >(tee >(grep ^[[:blank:]].*http.* | tr -d " \t\n\r"| xargs firefox.exe -P jupyterlab 2>/dev/null))

It works up to grep, but the URL only shows up after a few seconds, once the server has loaded:

michael@DESKTOP-OI3AOU6:~$ ./anaconda3/bin/jupyter lab ~ 2> >(grep ^[[:blank:]].*http.*)
        http://localhost:8888/?token=6988d45b9baa2e9f07c5a91a9a91457d6119e9884bdbcb10

Nothing shows up once I pipe it through tr:

michael@DESKTOP-OI3AOU6:~$ ./anaconda3/bin/jupyter lab ~ 2> >(grep ^[[:blank:]].*http.* | tr -d " \t\n\r")

How do I make the command (grep) wait a few seconds before it sends its output to the next command in the chain (tr)?


Solution

  • I was able to solve the problem by making it simpler: saving the output to a temp file instead of trying to pipe it all at once.

    michael@DESKTOP-OI3AOU6:~$ ~/anaconda3/bin/jupyter lab ~ 2> >(tee /tmp/jlab ) & sleep 4 ; cat /tmp/jlab | grep ^[[:blank:]].*http.* | tr -d " \t\n\r" | xargs firefox.exe -P jupyterlab ; rm /tmp/jlab; %

    Going through each part one by one, for others' reference:


    ~/anaconda3/bin/jupyter lab ~ Run a jupyter lab session in the home directory (~).


    2> Redirects standard error to the file that follows it.


    >() Process substitution: the enclosed command runs with its standard input connected to a pipe, and the shell passes that pipe wherever a file name is expected, so anything redirected to that "file" is fed to the command.

    tee /tmp/jlab Writes its input to the temporary file /tmp/jlab and also copies it to standard output. This is how I retain the original program's behavior of showing the information in the terminal. More info: https://en.wikipedia.org/wiki/Tee_(command)

    >(tee /tmp/jlab ) Combined with 2>, the program's standard error is fed into the tee command.
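
To see what 2> >(tee ...) does in isolation, here is a minimal sketch. The noisy function and the mktemp file are stand-ins made up for this illustration; they are not part of the original command. The sketch also shows that the substituted process runs asynchronously, so the script has to wait a moment before reading the file.

```bash
#!/usr/bin/env bash
# Stand-in for the real program: one line to stdout, one to stderr.
noisy() { echo 'to stdout'; echo 'to stderr' >&2; }

log=$(mktemp)   # hypothetical temp file, used here instead of /tmp/jlab

# stderr goes into tee, which copies it to the terminal and to $log.
noisy 2> >(tee "$log")

# tee runs asynchronously, so poll briefly until it has written the file.
for _ in 1 2 3 4 5; do
  [ -s "$log" ] && break
  sleep 1
done

saved=$(cat "$log")
rm -f "$log"
echo "captured: $saved"
```

Only the stderr line ends up in the file; the stdout line bypasses tee entirely.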


    & Runs the preceding command in the background, so the shell moves on to the next command right away.


    sleep 4 Wait 4 seconds for the server to spin up.
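
A fixed sleep 4 is a guess at how long the server needs. As a hedged alternative, the sketch below polls the file until the URL line appears or a timeout expires. The wait_for_url function, its timeout argument and the fake server are all assumptions for illustration, not something from the original answer.

```bash
#!/usr/bin/env bash
# wait_for_url FILE [TIMEOUT]: print the first indented URL line found in
# FILE, stripped of whitespace, or fail after TIMEOUT seconds (default 30).
wait_for_url() {
  local file=$1 timeout=${2:-30} waited=0 url
  while [ "$waited" -lt "$timeout" ]; do
    url=$(grep -m 1 '^[[:blank:]].*http' "$file" 2>/dev/null | tr -d ' \t\n\r')
    if [ -n "$url" ]; then
      printf '%s\n' "$url"
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Demo with a fake server that writes its URL after one second.
log=$(mktemp)
( sleep 1; printf '        http://localhost:8888/?token=abc\n' >> "$log" ) &
url=$(wait_for_url "$log" 10)
wait
rm -f "$log"
echo "$url"
```

In the one-liner this could stand in for sleep 4 ; cat /tmp/jlab | grep ... | tr ..., with the result passed to the browser.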


    ; Runs the next command once the previous one has finished.


    cat /tmp/jlab Writes the contents of the temporary file /tmp/jlab to standard output.


    | Pipes the standard output of the program on the left to the standard input of the program on the right. In this case, cat /tmp/jlab into grep ^[[:blank:]].*http.*.


    grep ^[[:blank:]].*http.* Extracts the lines that begin with a space or tab and contain http somewhere on the line; any number of characters is allowed in between and after. In this case it works out really nicely, but if an update to Jupyter changes the output, this is where it would break, and a more appropriate regex would have to be chosen.
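
The filter can be tried on its own against a few sample lines. This sketch quotes the pattern, since an unquoted pattern containing [ and * is subject to the shell's filename expansion if a matching file happens to exist in the current directory. The sample_log function and its shortened token are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Made-up sample resembling the server's stderr output.
sample_log() {
  printf '%s\n' \
    '[I 12:02:11.622 NotebookApp] http://localhost:8888/?token=abc' \
    '    To access the notebook, open this file in a browser:' \
    '        file:///home/user/nbserver-open.html' \
    '        http://localhost:8888/?token=abc'
}

# Only lines that start with a blank and contain "http" survive.
matches=$(sample_log | grep '^[[:blank:]].*http')
printf '%s\n' "$matches"
```

The log line that starts with [I ... is skipped because it does not begin with whitespace, and the file:/// line is skipped because it does not contain http.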


    | Pipes the output of grep to tr.


    tr -d " \t\n\r" Removes all spaces, tabs and line breaks from the line.
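
A small sketch of what this tr step does to a single matched line, and what would happen if grep ever matched two lines. The sample strings are invented for the illustration.

```bash
#!/usr/bin/env bash
# One indented line with a Windows-style line ending becomes a bare URL.
clean=$(printf '        http://localhost:8888/?token=abc\r\n' | tr -d ' \t\n\r')
echo "[$clean]"

# Caveat: because newlines are deleted too, two matching lines are glued
# together into one string.
glued=$(printf ' http://a\n http://b\n' | tr -d ' \t\n\r')
echo "[$glued]"
```

If the program's output could contain more than one matching line, limiting grep to the first match (for example with -m 1) before this step would avoid the glued result.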


    | Pipes output of tr to xargs. This is the complete url unique to the jupyter session.


    xargs firefox.exe -P jupyterlab Xargs takes its standard input and feeds it as an argument to the following command.
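
The xargs step can be checked without launching a browser by putting echo in front of the command, as in the sketch below. The shortened token is made up. The second half assumes GNU xargs (as found in typical WSL distributions), which runs the command once even on empty input unless -r is given; that is one way an empty string can reach the browser.

```bash
#!/usr/bin/env bash
# echo stands in for the real launch, so the final command line is printed.
cmd=$(printf '%s' 'http://localhost:8888/?token=abc' \
      | xargs echo firefox.exe -P jupyterlab)
echo "$cmd"

# With -r (GNU: --no-run-if-empty) nothing is run when no URL arrived.
empty=$(printf '' | xargs -r echo firefox.exe -P jupyterlab)
echo "empty input produced: [$empty]"
```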


    In this case the command is firefox.exe, a soft link I stored at /usr/local/bin/firefox.exe. The soft link points to the Windows executable on the mounted Windows drive, /mnt/c/Program Files/Mozilla Firefox/Firefox.exe. Launching it this way is just my convention, as Windows executables render better than programs run inside WSL and displayed through Xming.

    -P jupyterlab launches a profile I made which removes the tabs and navigation bar from Firefox. I also used the Customize option in Firefox so that the title bar is showing.

    The profile is set up by placing a custom CSS file in the profile's directory under %APPDATA%\Mozilla\Firefox\Profiles\

    The full path of the file is %APPDATA%\Mozilla\Firefox\Profiles\8vv7gs2r.jupyterlab\chrome\userChrome.css

    This file will set firefox so it will have no tabs or navigation cluttering the window.

    The contents of the file are as follows:

    /*
     * Do not remove the @namespace line -- it's required for correct functioning
     */
    @namespace url("http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul"); /* set default namespace to XUL */
    
    /*
     * Hide tab bar, navigation bar and scrollbars
     * !important may be added to force override, but not necessary
     * #content is not necessary to hide scroll bars
     */
    #TabsToolbar {visibility: collapse;}
    #navigator-toolbox {visibility: collapse;}
    

    ; Runs the next command once the previous one has finished.


    rm /tmp/jlab Housekeeping line which removes the temporary file. It should usually be deleted when the Linux system restarts, but that does not happen in all implementations. I have not checked whether WSL does it.
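
In a script, one common way to make this cleanup more robust is to create the file with mktemp and remove it from an EXIT trap, so it is deleted even if a later step fails. This is a sketch of that pattern, not what the one-liner above does; the with_temp_log function is a made-up name.

```bash
#!/usr/bin/env bash
# The function body runs in a subshell, so the EXIT trap fires when it ends.
with_temp_log() (
  log=$(mktemp) || exit 1
  trap 'rm -f "$log"' EXIT     # cleanup runs however the subshell exits
  echo 'sample line' > "$log"  # ... the real work would use "$log" here ...
  echo "$log"                  # report the path so the caller can check it
)

path=$(with_temp_log)
[ -e "$path" ] || echo "temporary file was removed"
```

mktemp also picks a unique name, which avoids clashes if two sessions are started at the same time.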


    ; Runs the next command once the previous one has finished.


    % Brings the last job that was moved to the background with & back to the foreground. The program will now receive interrupts (such as Control-C) as it did before it was moved to the background.
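
For comparison, here is a hedged sketch of a different approach that needs neither the temp file nor the sleep: a small reader loop that echoes every line as it arrives and runs a command on the first indented URL line. Reading line by line sidesteps the output buffering that makes a grep | tr | xargs chain wait for the end of the stream. The watch_for_url function is a made-up name, and the demo uses echo OPEN in place of the browser.

```bash
#!/usr/bin/env bash
# watch_for_url CMD [ARGS...]: copy stdin to stdout line by line and run
# CMD ARGS URL once, for the first line that starts with a blank and
# contains "http".
watch_for_url() {
  local opened=0 line url
  while IFS= read -r line; do
    printf '%s\n' "$line"               # keep the terminal output unchanged
    if [ "$opened" -eq 0 ]; then
      case $line in
        [[:blank:]]*http*)
          opened=1
          url=$(printf '%s' "$line" | tr -d ' \t\r')
          "$@" "$url" </dev/null        # do not let CMD consume our stdin
          ;;
      esac
    fi
  done
}

# Demo on made-up sample output.
out=$(printf '%s\n' '[I] starting' '        http://localhost:8888/?token=abc' \
      | watch_for_url echo OPEN)
printf '%s\n' "$out"

# Assumed usage with the real program (not run here):
#   ~/anaconda3/bin/jupyter lab ~ 2> >(watch_for_url firefox.exe -P jupyterlab)
```

Because the server stays in the foreground in that usage, the & and % steps would no longer be needed.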