Sub-shell differences between bash and ksh


I always believed that a sub-shell was not a child process, but another shell environment in the same process.

I use a basic set of built-ins:

(echo "Hello";read)

On another terminal:

ps -t pts/0
  PID TTY          TIME CMD
20104 pts/0    00:00:00 ksh

So, no child process in KornShell (ksh).

Enter bash, which appears to behave differently given the same command:

  PID TTY          TIME CMD
 3458 pts/0    00:00:00 bash
20067 pts/0    00:00:00 bash

So, a child process in bash.
From reading the man pages for bash, it is clear that another process is created for a sub-shell; however, it fakes $$, which is sneaky.
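
The $$ behaviour can be checked directly. A minimal sketch, assuming bash 4.0 or later (where the BASHPID variable is available); the variable names are only illustrative:

```shell
#!/usr/bin/env bash
# Assumption: bash >= 4.0, which provides BASHPID.
main_pid=$$

# Run the probe inside a ( ) subshell and capture what it reports.
sub_pids=$( ( printf '%s %s\n' "$$" "$BASHPID" ) )
sub_dollar=${sub_pids% *}     # value of $$ as seen inside the subshell
sub_bashpid=${sub_pids#* }    # value of BASHPID as seen inside the subshell

echo "main shell: \$\$=$main_pid"
echo "subshell:   \$\$=$sub_dollar BASHPID=$sub_bashpid"
```

Inside ( ), $$ still expands to the PID of the original shell, while BASHPID reports the process that is actually running the subshell.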

Is this difference between bash and ksh expected, or am I reading the symptoms incorrectly?

Edit: additional information: Running strace -f on bash and ksh on Linux shows that bash calls clone twice for the sample command (it does not call fork). So bash might be using threads (I tried ltrace but it core dumped!). KornShell calls neither fork, vfork, nor clone.


Solution

  • ksh93 works unusually hard to avoid subshells. Part of the reason is that it avoids stdio and makes extensive use of sfio, which allows builtins to communicate directly. Another reason is that ksh can, in theory, have a great many builtins. If built with SHOPT_CMDLIB_DIR, all of the cmdlib builtins are included and enabled by default. I can't give a comprehensive list of places where subshells are avoided, but it's typically in situations where only builtins are used and where there are no redirects.

    #!/usr/bin/env ksh
    
    # doCompat arr
    # "arr" is an indexed array name to be assigned an index corresponding to the detected shell.
    # 0 = Bash, 1 = Ksh93, 2 = mksh
    function doCompat {
        ${1:+:} return 1
        if [[ ${BASH_VERSION+_} ]]; then
            shopt -s lastpipe extglob
            eval "${1}[0]="
        else
            case "${BASH_VERSINFO[*]-${!KSH_VERSION}}" in
                .sh.version)
                    nameref v=$1
                    v[1]=
                    if builtin pids; then
                        function BASHPID.get { .sh.value=$(pids -f '%(pid)d'); }
                    elif [[ -r /proc/self/stat ]]; then
                        function BASHPID.get { read -r .sh.value _ </proc/self/stat; }
                    else
                        function BASHPID.get { .sh.value=$(exec sh -c 'echo $PPID'); }
                    fi 2>/dev/null
                    ;;
                KSH_VERSION)
                    nameref "_${1}=$1"
                    eval "_${1}[2]="
                    ;&
                *)
                    if [[ ! ${BASHPID+_} ]]; then
                        echo 'BASHPID requires Bash, ksh93, or mksh >= R41' >&2
                        return 1
                    fi
            esac
        fi
    }
    
    function main {
        typeset -a myShell
        doCompat myShell || exit 1 # stripped-down compat function.
        typeset x
    
        print -v .sh.version
        x=$(print -nv BASHPID; print -nr " $$"); print -r "$x" # comsubs are free for builtins with no redirections 
        _=$({ print -nv BASHPID; print -r " $$"; } >&2)        # but not with a redirect
        _=$({ printf '%s ' "$BASHPID" $$; } >&2); echo         # nor for expansions with a redirect
        _=$(printf '%s ' "$BASHPID" $$ >&2); echo # but if expansions aren't redirected, they occur in the same process.
        _=${ { print -nv BASHPID; print -r " $$"; } >&2; }     # However, ${ ;} is always subshell-free (obviously).
        ( printf '%s ' "$BASHPID" $$ ); echo                   # Basically the same rules apply to ( )
        read -r x _ <<<$(</proc/self/stat); print -r "$x $$"   # These are free in {{m,}k,z}sh. Only Bash forks for this.
        printf '%s ' "$BASHPID" $$ | cat # Sadly, pipes always fork. It isn't possible to precisely mimic "printf -v".
        echo
    } 2>&1
    
    main "$@"
    

    Output:

    Version AJM 93v- 2013-02-22
    31732 31732
    31735 31732
    31736 31732 
    31732 31732 
    31732 31732
    31732 31732 
    31732 31732
    31738 31732
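
For comparison, a similar probe in bash. This is a minimal sketch, assuming bash 4.0 or later for BASHPID; the temporary file and the variable names are illustrative and not part of the script above:

```shell
#!/usr/bin/env bash
# Assumption: bash >= 4.0 (BASHPID). Each probe records the PID of the
# process that actually executed it.
main=$BASHPID
tmp=$(mktemp)

{ group_pid=$BASHPID; }                               # brace group: current process
( echo "$BASHPID" ) >"$tmp";     paren_pid=$(<"$tmp") # ( ): separate process
comsub_pid=$(echo "$BASHPID")                         # $( ): separate process, even for a builtin
echo "$BASHPID" | cat >"$tmp";   pipe_pid=$(<"$tmp")  # pipeline element: separate process
rm -f "$tmp"

printf '%-8s %s\n' main "$main" group "$group_pid" paren "$paren_pid" \
    comsub "$comsub_pid" pipe "$pipe_pid"
```

In bash, only the brace group should report the same PID as the main shell; the other three constructs each run in a child process.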
    

    Another neat consequence of all this internal I/O handling is that some buffering issues just go away. Here's a funny example of reading lines with the tee and head builtins (don't try this in any other shell).

     $ ksh -s <<\EOF
    integer -a x
    builtin head tee
    printf %s\\n {1..10} |
        while head -n 1 | [[ ${ { x+=("$(tee /dev/fd/{3,4})"); } 3>&1; } ]] 4>&1; do
            print -r -- "${x[@]}"
        done
    EOF
    1
    0 1
    2
    0 1 2
    3
    0 1 2 3
    4
    0 1 2 3 4
    5
    0 1 2 3 4 5
    6
    0 1 2 3 4 5 6
    7
    0 1 2 3 4 5 6 7
    8
    0 1 2 3 4 5 6 7 8
    9
    0 1 2 3 4 5 6 7 8 9
    10
    0 1 2 3 4 5 6 7 8 9 10
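
The reason this is risky elsewhere: an external head reading from a pipe typically reads a whole buffer and cannot push the excess back, so later readers may lose data. The read builtin, by contrast, consumes only one line from a pipe. A small sketch of the difference, runnable in bash (the variable names are illustrative, and the result of the head case depends on the head implementation and on timing, so no particular output is promised for it):

```shell
#!/usr/bin/env bash
# Builtin read takes exactly one line from the pipe each time,
# so the second read still sees "2".
with_read=$(printf '%s\n' 1 2 3 | { IFS= read -r a; IFS= read -r b; printf '%s %s' "$a" "$b"; })
echo "read builtin:  $with_read"

# An external head may read more than one line from the pipe; anything it
# over-reads is no longer available to the second head.
with_head=$(printf '%s\n' 1 2 3 | { head -n 1; head -n 1; } | tr '\n' ' ')
echo "external head: $with_head"
```

The first case prints "1 2". The second always starts with 1, but whether the second head still finds a line to print is not guaranteed.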