Tags: c++, wait, exit, errno, qdebug

Wrong status when calling _exit(errno) from child


I'm calling execvp() with a deliberately wrong argument in a fork()'ed child. The errno number is properly set to ENOENT in the child process. I then terminate the child process with _exit(errno);.

My main process calls wait(). When I inspect the returned status with WIFEXITED and WEXITSTATUS I always get EINVAL for the first invocation. All other invocations return the correct ENOENT code.

I cannot explain this behavior. Below is the complete function; it does all of the above, though it is somewhat more involved.

QVariantMap
System::exec(const QString & prog, const QStringList & args)
{
  pid_t pid = fork();

  if (pid == 0) {
    int cargs_len = args.length() + 2;
    char * cargs[cargs_len];
    cargs[cargs_len - 1] = NULL;

    QByteArrayList as;
    as.push_back(prog.toLocal8Bit());

    std::transform(args.begin(), args.end(), std::back_inserter(as),
        [](const QString & s) { return s.toLocal8Bit(); });

    for (int i = 0; i < as.length(); ++i) {
      cargs[i] = as[i].data();
    }

    execvp(cargs[0], cargs);

    // in case execvp fails, terminate the child process immediately
    qDebug() << "(" << errno << ") " << strerror(errno);  // <----------
    _exit(errno);

  } else if (pid < 0) {
    goto fail;

  } else {

    sigset_t mask;
    sigset_t orig_mask;

    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);

    if (sigprocmask(SIG_BLOCK, &mask, &orig_mask) < 0) {
      goto fail;
    }

    struct timespec timeout;
    timeout.tv_sec = 0;
    timeout.tv_nsec = 10 * 1000 * 1000;

    while (true) {
      int ret = sigtimedwait(&mask, NULL, &timeout);

      if (ret < 0) {
        if (errno == EAGAIN) {
          // timeout
          goto win;
        } else {
          // error
          goto fail;
        }

      } else {
        if (errno == EINTR) {
          // not SIGCHLD
          continue;
        } else {
          int status = 0;
          if (wait(&status) == pid) {
            if (WIFEXITED(status)) {
              return { { "error", strerror(WEXITSTATUS(status)) } };
            } else {
              goto fail;
            }
          } else {
            goto fail;
          }
        }
      }
    }
  }

win:
  return {};

fail:
  return { { "error", strerror(errno) } };
}

It turns out that removing the line with the qDebug() call makes the problem go away. Why does adding a debugging call change the behavior of the program?


Solution

  • qDebug() << "(" << errno << ") " << strerror(errno);
    _exit(errno);
    

    Pretty much any call to a standard library function can modify errno. It's likely that qDebug, or even its << operator, calls some I/O functions which set errno. errno is not modified by most successful calls, but the higher the level of the API, the less you can be sure that no ordinary failed calls happen under the hood. So the value of errno that you print is not necessarily the value of errno that you later pass to _exit.

    As a general principle with errno, if you're doing anything more complex than just printing it out once, save the value to a variable before doing anything else.

    As already remarked in comments, note that most Unix systems, including all the common ones, only pass 8-bit values as exit statuses, but errno can be greater than 255. For example, if you run this program on a system where 256 is a possible error code, calling _exit(256) would make the parent see an exit status of 0 and wrongly conclude that the child succeeded.
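To see the truncation for yourself, here is a minimal sketch that forks a child, has it call _exit with a given value, and returns what the parent reads back through WEXITSTATUS. The helper name observed_exit_status is made up for this illustration, and the sketch assumes a typical Linux-style system where only the low 8 bits of the exit value are reported.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of the question's code): fork a child that
// calls _exit(code) and return the exit status the parent observes.
// Returns -1 if fork/waitpid fails or the child did not exit normally.
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        // Child: terminate immediately with the requested value.
        _exit(code);
    }

    // Parent: wait for this specific child and decode its status.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Under that assumption, observed_exit_status(2) gives 2, while observed_exit_status(256) gives 0 and observed_exit_status(257) gives 1, because only the value modulo 256 reaches the parent.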

    Generally it's enough to collapse all error values to success/failure. If you need to discriminate more than that, make sure that the information you're passing via exit/wait fits in the range 0–255.

    int exec_error = errno;
    qDebug() << "(" << exec_error << ") " << strerror(exec_error);
    _exit(!!exec_error);
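
If you do want the parent to see which error occurred, the same idea can be sketched without Qt as follows. This is only an illustration under a few assumptions: the helper name exec_and_report is invented here, the program is run without extra arguments, logging goes to stderr via fprintf instead of qDebug, and any errno value outside 1–255 is collapsed to 255 so it can never be mistaken for success.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper: try to exec `prog` in a child process and return the
// child's exit status as seen by the parent, or -1 on fork/wait failure.
int exec_and_report(const char * prog)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        char * const argv[] = { const_cast<char *>(prog), nullptr };
        execvp(prog, argv);

        // execvp only returns on failure: save errno before anything else.
        const int exec_error = errno;

        // Logging may overwrite errno, but the saved copy is unaffected.
        std::fprintf(stderr, "(%d) %s\n", exec_error, std::strerror(exec_error));

        // Keep the status within 1..255 so a failure never looks like 0.
        _exit(exec_error > 0 && exec_error <= 255 ? exec_error : 255);
    }

    // Parent: wait for the child and decode its exit status.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Calling it with a path that does not exist, for example exec_and_report("/nonexistent/no-such-program"), should report ENOENT to the parent regardless of what the logging call does to errno in the child.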