Why do builds for various projects fail with ‘Operation not permitted’ when using an on-device iOS compiler/toolchain?

I am an intermediately skilled Linux/Unix user trying to compile software for an iPad on a (jailbroken) iPad.

Many builds (for example, make and tex-live) fail with an Operation not permitted error. It looks like either Can't exec "blah": Operation not permitted or execvp: blah: Operation not permitted, where blah is aclocal, a configure script, libtool, or just about anything else. Curiously, finding the offending line in a Makefile or configure script and prefixing it with sudo -u mobile -E fixes the error for that line, only for it to reappear on a later line or in another file. Since I am already running the build scripts as mobile, I do not understand how this could possibly fix the issue, yet it does: I have confirmed that each such change lets the script run successfully up to the next failure. Running the build script itself with sudo or sudo -u mobile -E, or running the entire build as root, does not solve the issue; in every case I still have to edit the build scripts to add sudo.
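To narrow down which executables trigger the error, it can help to spawn them directly and look at the returned error code. The following is a minimal sketch, not part of any toolchain: the function name spawn_status is hypothetical, and it assumes only the standard posix_spawn and waitpid calls.

```c
#include <errno.h>
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

extern char **environ;

// Spawn `path` with `argv` and return posix_spawn's error code (0 on success).
// posix_spawn reports failure through its return value, not through errno.
int spawn_status(const char *path, char *const argv[]) {
    pid_t pid;
    int err = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawn(%s): %s (error %d)\n", path, strerror(err), err);
        return err;
    }
    // Reap the child so it does not linger as a zombie.
    int status;
    while (waitpid(pid, &status, 0) == -1 && errno == EINTR) {
        // Interrupted by a signal, retry.
    }
    return 0;
}
```

Calling it once on a failing script and once on that script's interpreter (with the script as an argument) shows whether the error depends on the kind of file being executed rather than on the user running it.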

I would like to know why this is happening and, if possible, how I could address the issue without editing build scripts. Any information about these types of errors would be interesting to me even if it does not solve my problem. I am aware that the permissions/security/entitlements system is unusual on iOS and would like to learn more about how it works.

I am using an iPad Pro 4 on jailbroken iOS 13.5 with the build tools from sbingner’s and MCApollo’s repos (repo.bingner.com and mcapollo.github.io/Public). In particular, I am using a build of LLVM 5 (manually installed from sbingner’s old debs), Clang 10, Darwin CC tools 927 and GNU Make 4.2.1. I have set CC, CXX, CFLAGS, etc. to point to clang-10 and my iOS 13.5 SDK with -isysroot and have confirmed that these settings are working. I would like to replace these with updated versions, but I cannot yet build these tools for myself due to this issue and a few others. I do have access to a Mac for cross-compilation if necessary, but I would rather use only my iPad because I like the challenge.

I can attach any logs necessary or provide more information if that would be useful; I do not know enough about this issue to know what information is useful. Thanks in advance for helping me!

Solution

For anyone who needs to address this issue on a jailbreak that does not already fix it, I have written a userland hook (pasted below) based on the posix_spawn implementation in the source of Apple’s xnu kernel.

Compile it with Theos and inject it into all processes spawned by your shell by setting the environment variable DYLD_INSERT_LIBRARIES to the path of the resulting dylib. Note: some tweak injectors (namely libhooker) reset DYLD_INSERT_LIBRARIES, so if you notice this behavior, be sure to inject only your library.
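One way to notice whether DYLD_INSERT_LIBRARIES has been reset is to have a child process inspect its own environment. This is a minimal sketch under stated assumptions: the function name dylib_is_listed is hypothetical, and it relies on DYLD_INSERT_LIBRARIES being a colon-separated list of paths. It only reports what the variable contains; it does not prove that dyld actually loaded the library.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Return true if `dylib_path` is one of the colon-separated entries
// in the DYLD_INSERT_LIBRARIES variable of the current process.
bool dylib_is_listed(const char *dylib_path) {
    const char *value = getenv("DYLD_INSERT_LIBRARIES");
    if (value == NULL || dylib_path == NULL || *dylib_path == '\0') {
        return false;
    }
    size_t want = strlen(dylib_path);
    const char *p = value;
    while (*p != '\0') {
        // Each entry ends at the next ':' or at the end of the string.
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (len == want && strncmp(p, dylib_path, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return false;
}
```

Running a small program that calls this from inside the build shell, with the path of your own dylib, indicates whether the variable survived into the spawned process.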

Because the implementation of the exec syscalls in iOS calls out to posix_spawn, this hook fixes all of the exec-related issues I’ve run into so far.

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <spawn.h>
#include <stdbool.h> // For `true` when this file is compiled as C/Objective-C.

// Copied from bsd/kern/kern_exec.c
#define IS_WHITESPACE(ch) ((ch == ' ') || (ch == '\t'))
#define IS_EOL(ch) ((ch == '#') || (ch == '\n'))
// Copied from bsd/sys/imgact.h
#define IMG_SHSIZE 512

// Here, we provide an alternate implementation of posix_spawn which correctly handles #!.
// This is based on the implementation of posix_spawn in bsd/kern/kern_exec.c from Apple's xnu source.
// Thus, I am fairly confident that this posix_spawn has correct behavior relative to macOS.
%hookf(int, posix_spawn, pid_t *pid, const char *orig_path, const posix_spawn_file_actions_t *file_actions, const posix_spawnattr_t *attrp, char *const orig_argv[], char *const envp[]) {
    
    // Call orig before checking for anything.
    // This mirrors the standard implementation of posix_spawn because it first checks if we are spawning a binary.
    int err = %orig;
    
    // %orig returns EPERM when spawning a script.
    // Thus, if err is anything other than EPERM, we can just return like normal.
    if (err != EPERM) 
        return err;
    
    // At this point, we do not need to check for exec permissions or anything like that,
    //  because posix_spawn would have returned that error instead of EPERM.

    // Now we open the file for reading so that we can check if it's a script.
    // If it turns out not to be a script, the EPERM must be from something else
    //  so we just return err.

    FILE *file = fopen(orig_path, "r");
    if (file == NULL) {
        return err;
    }
    if (fseek(file, 0, SEEK_SET)) {
        // Close the stream so it is not leaked on this error path.
        fclose(file);
        return err;
    }

    // In exec_activate_image, the data buffer is filled with the first PAGE_SIZE bytes of the file.
    // However, in exec_shell_imgact, only the first IMG_SHSIZE bytes are used.
    // Thus, we read IMG_SHSIZE bytes out of our file.
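    // The buffer is filled with newlines so that if the file is shorter than IMG_SHSIZE bytes,
    //  the logic reads an IS_EOL.
    // An initializer of {'\n'} would set only the first byte and zero the rest, so use memset.
    char vdata[IMG_SHSIZE];
    memset(vdata, '\n', sizeof(vdata));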
    if (fread(vdata, 1, IMG_SHSIZE, file) < 2) { // If we couldn't read at least two bytes, it's not a script.
        fclose(file);
        return err;
    }

    // Now that we've filled the buffer, we don't need the file anymore.
    fclose(file);
    
    // Now we follow exec_shell_imgact.
    // The point of this is to confirm we have a script 
    //  and extract the usable part of the interpreter+arg string.
    // Where they return -1, we don't have a shell script, so we return err.
    // Where they return an error, we return that same error.
    // We don't bother doing any SUID stuff because SUID scripts should be disabled anyway.
        char *ihp;
        char *line_startp, *line_endp;
        
        // Make sure we have a shell script.
        if (vdata[0] != '#' || vdata[1] != '!') {
                return err;
        }
        
        // Try to find the first non-whitespace character
        for (ihp = &vdata[2]; ihp < &vdata[IMG_SHSIZE]; ihp++) {
                if (IS_EOL(*ihp)) {
                        // Did not find interpreter, "#!\n"
                        return ENOEXEC;
                } else if (IS_WHITESPACE(*ihp)) {
                        // Whitespace, like "#!    /bin/sh\n", keep going.
                } else {
                        // Found start of interpreter
                        break;
                }
        }
        
        if (ihp == &vdata[IMG_SHSIZE]) {
                // All whitespace, like "#!           "
                return ENOEXEC;
        }

        line_startp = ihp;
        
        // Try to find the end of the interpreter+args string
        for (; ihp < &vdata[IMG_SHSIZE]; ihp++) {
                if (IS_EOL(*ihp)) {
                        // Got it
                        break;
                } else {
                        // Still part of interpreter or args
                }
        }
        
        if (ihp == &vdata[IMG_SHSIZE]) {
                // A long line, like "#! blah blah blah" without end
                return ENOEXEC;
        }
        
        // Backtrack until we find the last non-whitespace
        while (IS_EOL(*ihp) || IS_WHITESPACE(*ihp)) {
                ihp--;
        }
        
        // The character after the last non-whitespace is our logical end of line
        line_endp = ihp + 1;
        
        /*
         * Now we have pointers to the usable part of:
         *
         * "#!  /usr/bin/int first    second   third    \n"
         *      ^ line_startp                       ^ line_endp
         */

    // Now, exec_shell_imgact copies the interpreter into another buffer and then null-terminates it.
    // Then, it copies the entire interpreter+args into another buffer and null-terminates it for later processing into argv.
    // This processing is done in exec_extract_strings, which goes through and null-terminates each argument.
    // We will just do this all at once since that's much easier.
    
    // Keep track of how many arguments we have.
    int i_argc = 0;

    ihp = line_startp;
    while (true) {
        // ihp is on the start of an argument.
        i_argc++;
        // Scan to the end of the argument.
        for (; ihp < line_endp; ihp++) {
            if (IS_WHITESPACE(*ihp)) {
                // Found the end of the argument
                break;
            } else {
                // Keep going
            } 
        }
        // Null terminate the argument
        *ihp = '\0';
        // Scan to the beginning of the next argument.
        for (; ihp < line_endp; ihp++) {
            if (!IS_WHITESPACE(*ihp)) {
                // Found the next argument
                break;
            } else {
                // Keep going
            }
        }
        if (ihp == line_endp) {
            // We've reached the end of the arg string
            break;
        }
        // If we are here, ihp is the start of an argument.
    }
    // Now line_startp is a bunch of null-terminated arguments possibly padded by whitespace.
    // i_argc is now the count of the interpreter arguments.
       
    // Our new argv should look like i_argv[0], i_argv[1], i_argv[2], ..., orig_path, orig_argv[1], orig_argv[2], ..., NULL
    //  where i_argv is the arguments to be extracted from line_startp;
    // To allocate our new argv, we need to know orig_argc.
    int orig_argc = 0;
    while (orig_argv[orig_argc] != NULL) {
        orig_argc++;
    }
    
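    // We need space for i_argc + 1 + (orig_argc - 1) + 1 char*'s.
    // If orig_argv is empty, orig_path still takes a slot, so reserve at least one for it.
    char *argv[i_argc + (orig_argc > 0 ? orig_argc : 1) + 1];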
    
    // Copy i_argv into argv
    int i = 0;
    ihp = line_startp;
    for (; i < i_argc; i++) {
        // ihp is on the start of an argument
        argv[i] = ihp;
        // Scan to the next null-terminator
        for (; ihp < line_endp; ihp++) {
            if (*ihp == '\0') {
                // Found it
                break;
            } else {
                // Keep going
            }
        }
        // Go to the next character
        ihp++;
        // Then scan to the next argument. 
        // There must be another argument because we already counted i_argc.
        for (; ihp < line_endp; ihp++) {
            if (!IS_WHITESPACE(*ihp)) {
                // Found it
                break;
            } else {
                // Keep going
            }
        }
        // ihp is on the start of an argument.
    }

    // Then, copy orig_path into argv.
    // We need to make a copy of orig_path to avoid issues with const.
    char orig_path_copy[strlen(orig_path)+1];
    strcpy(orig_path_copy, orig_path);  
    argv[i] = orig_path_copy;
    i++;

    // Now, copy orig_argv[1...] into argv.
    for (int j = 1; j < orig_argc; i++, j++) {
        argv[i] = orig_argv[j];
    }
    // Finally, add the null.
    argv[i] = NULL;
    // Now, our argv is setup correctly. 

    // Now, we can call out to posix_spawn again.
    // The interpreter is in argv[0], so we use that for the path.
    return %orig(pid, argv[0], file_actions, attrp, argv, envp);
}
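
The #! parsing in the hook can be tried out on its own, without Theos or a device. Below is a compact sketch of the same splitting logic in plain C. The function name split_shebang and its signature are hypothetical and not taken from xnu; the two macros mirror the ones used in the hook.

```c
#include <stddef.h>
#include <string.h>

#define IS_WHITESPACE(ch) (((ch) == ' ') || ((ch) == '\t'))
#define IS_EOL(ch) (((ch) == '#') || ((ch) == '\n'))

// Split the "#!" line at the start of `buf` (length `len`) into
// NUL-terminated tokens, modifying `buf` in place.
// Stores the token pointers in `out` and returns the token count.
// Returns -1 if `buf` does not start with a complete, non-empty "#!" line
// or if the line has more than `max` tokens.
int split_shebang(char *buf, size_t len, char *out[], int max) {
    if (len < 3 || buf[0] != '#' || buf[1] != '!') {
        return -1;
    }
    size_t i = 2;
    // Skip whitespace between "#!" and the interpreter.
    while (i < len && IS_WHITESPACE(buf[i])) i++;
    if (i == len || IS_EOL(buf[i])) {
        return -1; // No interpreter, like "#!\n".
    }
    // Find the end of the interpreter+args string.
    size_t end = i;
    while (end < len && !IS_EOL(buf[end])) end++;
    if (end == len) {
        return -1; // The line never ends within the buffer.
    }
    int n = 0;
    while (i < end) {
        if (n == max) {
            return -1;
        }
        out[n++] = &buf[i];
        // Scan to the end of this token.
        while (i < end && !IS_WHITESPACE(buf[i])) i++;
        size_t tok_end = i;
        // Skip whitespace before the next token (or trailing whitespace).
        while (i < end && IS_WHITESPACE(buf[i])) i++;
        buf[tok_end] = '\0';
    }
    return n;
}
```

The tokens returned here correspond to i_argv in the hook: the new argv is these tokens, followed by the script path, followed by the original arguments from index 1 onward.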