
How can interpreter detect being called from a script as opposed to command line?


As is "well known", a script my-script-file that starts with

#!/path/to/interpreter -arg1 val1 -arg2 val2

is executed by exec calling /path/to/interpreter with 2(!) arguments:

  1. -arg1 val1 -arg2 val2
  2. my-script-file

(and not, as one might naively expect, with 5 arguments

  1. -arg1
  2. val1
  3. -arg2
  4. val2
  5. my-script-file

as has been explained in many previous questions, e.g., https://stackoverflow.com/a/4304187/850781).
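
To observe this from the interpreter's side, a small diagnostic helper can print every argv entry in brackets so that embedded spaces are visible. This is only a sketch; dump_args is a made-up name, not part of any library.

```c
#include <assert.h>
#include <stdio.h>

/* dump_args is a hypothetical helper, not part of any library.
   It writes one line per argv entry to `out`, bracketed so that
   embedded spaces stand out, and returns the number of arguments
   that follow argv[0]. */
int
dump_args (FILE *out, int argc, char *argv[])
{
  assert (out != NULL && argv != NULL);
  for (int i = 0; i < argc; i++)
    fprintf (out, "argv[%d] = [%s]\n", i, argv[i]);
  return argc > 0 ? argc - 1 : 0;
}
```

Called as dump_args (stdout, argc, argv) at the top of main, it should, for the shebang line above on Linux, report the whole option string as a single argv[1] followed by the script path as argv[2].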

My question is from the point of view of an interpreter developer, not a script writer.

How do I detect from inside the interpreter executable that I was called via a shebang line as opposed to from the command line?

Then I will be able to decide whether I need to split my first argument on spaces to go from "-arg1 val1 -arg2 val2" to ["-arg1", "val1", "-arg2", "val2"] or not.

The main issue here is script files whose names contain spaces. If I always split the 1st argument, I will fail like this:

$ my-interpreter "weird file name with spaces"
my-interpreter: "weird": No such file or directory
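
For reference, the unconditional split I have in mind looks roughly like the sketch below. split_on_blanks is a hypothetical name, and the function deliberately does no quoting or escaping, which is exactly why it breaks a file name that contains spaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* split_on_blanks is a hypothetical helper. It splits `arg` on runs of
   spaces and tabs and returns a NULL-terminated vector of words, storing
   the word count in *count. The vector and the word storage live in one
   malloc'ed block, so a single free() releases everything.
   Returns NULL if allocation fails. */
char **
split_on_blanks (const char *arg, size_t *count)
{
  assert (arg != NULL && count != NULL);
  size_t len = strlen (arg);
  /* Words are separated by at least one blank, so there can be at most
     (len + 1) / 2 of them; one extra slot holds the terminating NULL. */
  size_t max_words = (len + 1) / 2;
  char **vec = malloc ((max_words + 1) * sizeof *vec + len + 1);
  if (vec == NULL)
    return NULL;
  char *copy = (char *) (vec + max_words + 1);
  memcpy (copy, arg, len + 1);

  size_t n = 0;
  char *p = copy;
  while (*p != '\0')
    {
      p += strspn (p, " \t");    /* skip blanks before the next word */
      if (*p == '\0')
        break;
      vec[n++] = p;
      p += strcspn (p, " \t");   /* advance to the end of the word */
      if (*p != '\0')
        *p++ = '\0';             /* terminate the word in place */
    }
  vec[n] = NULL;
  *count = n;
  return vec;
}
```

Applied to "-arg1 val1 -arg2 val2" this yields the four words I want, but applied to "weird file name with spaces" it yields five words starting with "weird", which is the failure shown above.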

Solution

  • On Linux, with GNU libc or musl libc, you can use the auxiliary vector to distinguish the two cases.

    Here is some sample code:

    #define _GNU_SOURCE 1
    #include <stdio.h>
    #include <errno.h>
    #include <sys/auxv.h>
    #include <sys/stat.h>
    
    int
    main (int argc, char* argv[])
    {
      printf ("argv[0] = %s\n", argv[0]);
      /* https://www.gnu.org/software/libc/manual/html_node/Error-Messages.html */
      printf ("program_invocation_name = %s\n", program_invocation_name);
      /* http://man7.org/linux/man-pages/man3/getauxval.3.html */
      printf ("auxv[AT_EXECFN] = %s\n", (const char *) getauxval (AT_EXECFN));
      /* Determine whether the last two are the same. */
      struct stat statbuf1, statbuf2;
      if (stat (program_invocation_name, &statbuf1) >= 0
          && stat ((const char *) getauxval (AT_EXECFN), &statbuf2) >= 0)
        printf ("same? %d\n", statbuf1.st_dev == statbuf2.st_dev && statbuf1.st_ino == statbuf2.st_ino);
    }
    

    Result for a direct invocation:

    $ ./a.out 
    argv[0] = ./a.out
    program_invocation_name = ./a.out
    auxv[AT_EXECFN] = ./a.out
    same? 1
    

    Result for an invocation through a script that starts with #!/home/bruno/a.out:

    $ ./a.script 
    argv[0] = /home/bruno/a.out
    program_invocation_name = /home/bruno/a.out
    auxv[AT_EXECFN] = ./a.script
    same? 0
    

    This approach is, of course, highly unportable: only Linux has the getauxval function. And there are surely cases where it does not work well.
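
Inside an interpreter, the same comparison could be wrapped in a helper along the following lines. This is a sketch under assumptions: started_via_shebang is a hypothetical name, and the caller chooses which path identifies the running executable, either program_invocation_name as in the sample above or "/proc/self/exe", which is a substitution of mine that needs a mounted /proc.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/auxv.h>   /* getauxval, AT_EXECFN */
#include <sys/stat.h>

/* started_via_shebang is a hypothetical helper.
   `self_path` must name the running executable, for example
   program_invocation_name or "/proc/self/exe" (both are assumptions
   about the environment).
   Returns 1 if the file passed to execve differs from the executable
   (likely a script with a shebang line), 0 if they are the same file,
   and -1 if this cannot be determined. */
int
started_via_shebang (const char *self_path)
{
  assert (self_path != NULL);
  const char *execfn = (const char *) getauxval (AT_EXECFN);
  struct stat self, executed;

  if (execfn == NULL)
    return -1;
  if (stat (self_path, &self) < 0 || stat (execfn, &executed) < 0)
    return -1;
  /* Same device and inode means the executed file is the interpreter. */
  return !(self.st_dev == executed.st_dev && self.st_ino == executed.st_ino);
}
```

The interpreter would then split its first argument only when this returns 1, and treat -1 as "do not split" to stay on the safe side for file names with spaces.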