gmake: a target completes but $(realpath ...) doesn't find it


Environment stuff:

  • Solaris NFS file servers running NFS 3
  • Errors occur in Linux or Solaris environments
  • Using GNU Make 3.82
  • Using Sun Studio compilers, if that matters

This is a vastly simplified example of the build I'm looking at:

all: ${list of shared objects to build}
  @do whatever

lib1.so: ${1s objects}
lib2.so: ${2s objects}
lib3.so: ${3s objects}
#...

%.so:
  $(call CHECK_DEPENDENCIES_EXIST)
  @${LD} ${LDFLAGS} ${^} -o ${@}

%.o : %.c
  @do stuff

%.o : %.cc
  @do stuff

define CHECK_DEPENDENCIES_EXIST =
$(if $(realpath ${^}),,$(warning No dependencies specified for ${@})false)
endef

The short version: $(realpath x y z), which returns the absolute, symlink-free path of each argument that exists, is dropping files from the list under some circumstances, and I think it has to do with NFS. Which target fails isn't predictable; sometimes a target fails after succeeding the last 10 times. If I take false out of the macro, the build continues without error; that is, the linker does not complain about the supposedly missing file(s).

I'll spare you the drawn-out explanation; suffice it to say, the macro is helpful for debugging.


Solution

  • Turns out there's a bug in gmake. From the GNU Make 3.82 source, function.c, on or about line 2026:

    while ((path = find_next_token (&p, &len)) != 0 ) {
    /* ... */
        if (
    #ifdef HAVE_REALPATH
             realpath (in, out)
    #else
             abspath (in, out) && stat (out, &st) == 0
    #endif
           )
        {
          /* ... */
        }
      }
    }
    /* ... */
    

    Occasionally, a call to realpath would get interrupted (EINTR); nothing in this code checks errno, so the call just silently fails and the file is dropped from the result.

    So the file did exist; the problem was that $(realpath ...) was being interrupted by a signal (presumably a child instance of gmake signaling its completion, or something similar), and this function wasn't designed to recover from that sort of event.
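
    To make that distinction concrete, here is a minimal sketch (not GNU Make code) of a lookup that inspects errno after realpath(3) fails, so an interrupted call is not mistaken for a missing file. The enum and the name resolve_with_reason are hypothetical, and the sketch assumes, as described above, that realpath can fail with EINTR in this kind of environment:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose realpath(3) under strict -std= modes */
#endif
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical classification, for illustration only. */
enum resolve_status { RESOLVED, MISSING, INTERRUPTED, OTHER_ERROR };

/* Resolve `in` into `out` (at least PATH_MAX bytes). On failure, use errno
   to report why, instead of treating every failure as "file not found". */
static enum resolve_status
resolve_with_reason (const char *in, char *out)
{
  errno = 0;
  if (realpath (in, out) != NULL)
    return RESOLVED;
  if (errno == ENOENT || errno == ENOTDIR)
    return MISSING;        /* the path really is not there */
  if (errno == EINTR)
    return INTERRUPTED;    /* a signal arrived; the file may well exist */
  return OTHER_ERROR;
}
```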

    To fix the problem:

    while ((path = find_next_token (&p, &len)) != 0 ) {
    

    ... becomes:

    while ( errno == EINTR || (path = find_next_token (&p, &len)) != 0 ) {
    

    The || short-circuits: while errno is EINTR, find_next_token is not called, so the loop retries the same token instead of marching on to the next one.
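
    A more conventional way to express the same retry, sketched here under the assumption that a bounded number of attempts is acceptable, is a small wrapper around realpath(3). The name realpath_retry and the max_tries parameter are hypothetical, not part of GNU Make. Unlike a test of errno in the loop condition, it clears errno before each call; successful library calls never reset errno, so a stale EINTR could otherwise make the loop handle a token twice:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose realpath(3) under strict -std= modes */
#endif
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: call realpath(3) again whenever it fails with EINTR,
   making at most max_tries attempts. `out` needs at least PATH_MAX bytes.
   Returns `out` on success, or NULL with errno describing the last failure. */
static char *
realpath_retry (const char *in, char *out, int max_tries)
{
  char *result = NULL;
  int tries;

  for (tries = 0; tries < max_tries; ++tries)
    {
      errno = 0;                    /* discard any stale EINTR */
      result = realpath (in, out);
      if (result != NULL || errno != EINTR)
        break;                      /* success, or a real error such as ENOENT */
    }
  return result;
}
```

    In func_realpath, the call realpath (in, out) would then become something like realpath_retry (in, out, 5); the limit of 5 is arbitrary.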