Determine which binary will run via execlp in advance

Edit #1

The "possible duplicates" suggested so far are not duplicates. They test whether $FILE exists in $PATH rather than providing the full path to the first valid result, and the top answer uses shell commands, not pure C.

Original Question

Of all the exec family functions, there are a few which do $PATH lookups rather than requiring an absolute path to the binary to execute.

From man exec:

The execlp(), execvp(), and execvpe() functions duplicate the actions of the shell in searching for an executable file if the specified filename does not contain a slash (/) character. The file is sought in the colon-separated list of directory pathnames specified in the PATH environment variable. If this variable isn't defined, the path list defaults to the current directory followed by the list of directories returned by confstr(_CS_PATH). (This confstr(3) call typically returns the value "/bin:/usr/bin".)

Is there a simple, straightforward way, to test what the first "full path to execute" will evaluate to, without having to manually iterate through all the elements in the $PATH environment variable, and appending the binary name to the end of the path? I would like to use a "de facto standard" approach to estimating the binary to be run, rather than re-writing a task that has likely already been implemented several times over in the past.

I realize that this won't be a guarantee, since someone could potentially invalidate this check via a buggy script, TOCTOU attacks, etc. I just need a decent approximation for testing purposes.

Thank you.

Solution

Is there a simple, straightforward way, to test what the first "full path to execute" will evaluate to, without having to manually iterate through all the elements in the $PATH environment variable

No, you need to iterate through $PATH (i.e. getenv("PATH") in C code). Some non-standard libraries provide a way to do that, but the task is so simple that you should not bother with them. You can use strchr(3) to find the next occurrence of the colon (:), so coding that loop is straightforward. As Jonathan Leffler commented, there are subtleties (e.g. permissions, dangling symbolic links, some other process adding a new executable to a directory mentioned in your $PATH), but most programs ignore them.
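The loop described above might look like the following minimal sketch. The name find_in_path is hypothetical (it is not a standard function), the fallback used when PATH is unset is a simplification of what the man page describes, and access(2) checks permissions against the real rather than the effective user ID, so treat the result as an approximation of what execlp would pick.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of libc): writes into buf the first
 * candidate "<dir>/<name>" from PATH that is a regular, executable file.
 * Returns 0 on success, -1 if nothing matched or buf was too small.
 * On failure the contents of buf are unspecified. */
static int find_in_path(const char *name, char *buf, size_t buflen)
{
    if (name == NULL || *name == '\0')
        return -1;

    /* A name containing a slash is used as-is; no PATH search happens. */
    if (strchr(name, '/') != NULL) {
        int n = snprintf(buf, buflen, "%s", name);
        if (n < 0 || (size_t)n >= buflen)
            return -1;
        return access(buf, X_OK) == 0 ? 0 : -1;
    }

    const char *path = getenv("PATH");
    if (path == NULL)
        path = "/bin:/usr/bin"; /* assumption: simplified fallback */

    for (const char *p = path;;) {
        const char *colon = strchr(p, ':');
        size_t len = colon ? (size_t)(colon - p) : strlen(p);
        int n;

        if (len == 0) /* an empty entry means the current directory */
            n = snprintf(buf, buflen, "./%s", name);
        else
            n = snprintf(buf, buflen, "%.*s/%s", (int)len, p, name);

        struct stat st;
        if (n > 0 && (size_t)n < buflen &&
            stat(buf, &st) == 0 && S_ISREG(st.st_mode) &&
            access(buf, X_OK) == 0)
            return 0;

        if (colon == NULL)
            break; /* that was the last entry */
        p = colon + 1;
    }
    return -1;
}
```

A caller would declare something like char buf[4096]; and, if find_in_path("ls", buf, sizeof buf) returns 0, print buf. The stat(2) check is there because access(2) with X_OK also succeeds on searchable directories.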

What really matters is the value of PATH just before execvp runs. In practice, that is the value of PATH when your program started, because outside processes cannot change it. You just need to be sure that your program does not change PATH itself, which is very likely the case (the difficult corner case would be another thread of the same process changing the PATH environment variable with putenv(3) or setenv(3)).

In practice the PATH won't change (unless you have some code explicitly changing it). Even if you use proprietary libraries and don't have time to check their source code, you can expect PATH to stay the same in practice during execution of your process.

If you need something more precise, and assuming you call the exec*p functions (execlp, execvp) on program names that are compile-time constants, or at least constant once your program has finished initializing from its configuration files, you could do what many shells do: cache the result of searching PATH in a hash table and call execve on the cached path. You still cannot avoid the issue of some other process adding or removing files in the directories mentioned in your PATH, but most programs don't care (they are written with the implicit assumption that this doesn't happen, or that the program is notified when it does: look at the rehash builtin of zsh as an example).
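As a rough illustration of the caching idea, here is a minimal sketch that uses a small fixed-size array instead of a real hash table. All names (search_path, resolve_cached, cache_flush) and the size limits are hypothetical, and the search is simplified: it only handles bare names and requires PATH to be set.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define CACHE_SLOTS 16   /* assumption: a handful of program names */
#define PATH_BUF    4096

struct cache_entry {
    char name[64];
    char full[PATH_BUF];
};

static struct cache_entry cache[CACHE_SLOTS];
static size_t cache_used;

/* Simplified PATH search: bare names only, PATH must be set. */
static int search_path(const char *name, char *buf, size_t buflen)
{
    const char *p = getenv("PATH");
    struct stat st;

    if (p == NULL || *name == '\0' || strchr(name, '/') != NULL)
        return -1;
    for (;;) {
        size_t len = strcspn(p, ":"); /* length of the current entry */
        int n = snprintf(buf, buflen, "%.*s%s%s",
                         (int)len, p, len ? "/" : "./", name);
        if (n > 0 && (size_t)n < buflen &&
            stat(buf, &st) == 0 && S_ISREG(st.st_mode) &&
            access(buf, X_OK) == 0)
            return 0;
        if (p[len] == '\0')
            return -1; /* no more entries */
        p += len + 1;
    }
}

/* Returns the cached full path for name, searching PATH on first use.
 * Returns NULL if the name cannot be resolved or the cache is full. */
static const char *resolve_cached(const char *name)
{
    for (size_t i = 0; i < cache_used; i++)
        if (strcmp(cache[i].name, name) == 0)
            return cache[i].full;

    if (cache_used == CACHE_SLOTS || strlen(name) >= sizeof cache[0].name)
        return NULL;

    struct cache_entry *e = &cache[cache_used];
    if (search_path(name, e->full, sizeof e->full) != 0)
        return NULL;
    strcpy(e->name, name); /* length was checked above */
    cache_used++;
    return e->full;
}

/* Forget everything, in the spirit of the shells' rehash builtin. */
static void cache_flush(void)
{
    cache_used = 0;
}
```

The returned string can then be passed to execv(3) or execve(2) in place of the bare name. A cached entry can go stale if the file is removed or shadowed later, which is why the exec call still has to be checked for failure.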

But you always need to test for failure of the exec functions (including execlp(3) and execve(2)) and of fork. They can fail for many reasons, even if PATH has not changed and the directories and files mentioned in it have not been changed.
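A minimal sketch of that checking follows. The helper name run_and_wait is hypothetical, and exiting the child with status 127 when exec fails is only a convention borrowed from shells, not something the exec functions do for you.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs file (looked up via PATH) with no arguments
 * and waits for it. Returns the child's exit status, or -1 if fork or
 * waitpid failed or the child did not exit normally. */
static int run_and_wait(const char *file)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork"); /* e.g. process limit reached, out of memory */
        return -1;
    }

    if (pid == 0) {
        execlp(file, file, (char *)NULL);
        /* Only reached if execlp failed: not found, not executable, ... */
        fprintf(stderr, "execlp %s: %s\n", file, strerror(errno));
        _exit(127); /* convention borrowed from shells */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) { /* retry only when interrupted by a signal */
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this sketch, run_and_wait("true") would be expected to return 0 on a typical system, while a name that is not found anywhere in PATH yields 127 and a message on stderr.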