
Why do builds for various projects fail with ‘Operation not permitted’ using iOS on-device compiler/toolchain?


I am an intermediately skilled Linux/Unix user trying to compile software for an iPad on a (jailbroken) iPad.

Many builds (for example, make and tex-live) fail with an Operation not permitted error. It looks like either Can't exec "blah": Operation not permitted or execvp: blah: Operation not permitted, where blah is aclocal, a configure script, libtool, or just about anything. Curiously, finding the offending line in a Makefile or configure script and prefixing it with sudo -u mobile -E fixes the error for that line, only for it to reappear on a later line or in another file. Since I am already running the build scripts as mobile, I do not understand how this could possibly fix the issue, yet it does. I have confirmed that making these changes does allow the script to run successfully up to that point. Running the whole build script with sudo or sudo -u mobile -E, or running the entire build as root, does not solve the issue; either way, I still must edit the build scripts to add sudo calls.
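To narrow down which programs trigger the error, a small probe like the following may help. It is only a sketch: the helper name try_spawn and the SPAWN_DEMO_MAIN macro are made up for illustration, and it assumes the failure surfaces as an error code returned by posix_spawn. It spawns one program directly and prints the resulting error, so a compiled binary such as /bin/sh can be compared against a #! script such as ./configure.

```c
#include <errno.h>
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: spawn `path` with `argv`, wait for the child if one
 * was created, and return the posix_spawn error code (0 on success).
 */
static int try_spawn(const char *path, char *const argv[])
{
    pid_t pid;
    int err = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (err == 0) {
        int status;
        /* Reap the child so it does not linger as a zombie. */
        while (waitpid(pid, &status, 0) == -1 && errno == EINTR) {
        }
    }
    return err;
}

#ifdef SPAWN_DEMO_MAIN
/* Build with -DSPAWN_DEMO_MAIN to get a command-line probe. */
int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s /path/to/program [args...]\n", argv[0]);
        return 2;
    }
    int err = try_spawn(argv[1], &argv[1]);
    printf("%s: %s\n", argv[1], err ? strerror(err) : "spawned OK");
    return err != 0;
}
#endif
```

If the binary spawns fine while the script reports Operation not permitted, that would point at script handling rather than at ordinary file permissions.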

I would like to know why this is happening and, if possible, how I could address the issue without editing build scripts. Any information about these types of errors would be interesting to me even if it does not solve my problem. I am aware that the permissions/security/entitlements system is unusual on iOS and would like to learn more about how it works.

I am using an iPad Pro 4 on jailbroken iOS 13.5 with the build tools from sbingner’s and MCApollo’s repos (repo.bingner.com and mcapollo.github.io/Public). In particular, I am using a build of LLVM 5 (manually installed from sbingner’s old debs), Clang 10, Darwin CC tools 927 and GNU Make 4.2.1. I have set CC, CXX, CFLAGS, etc. to point to clang-10 and my iOS 13.5 SDK with -isysroot and have confirmed that these settings are working. I would like to replace these with updated versions, but I cannot yet build these tools for myself due to this issue and a few others. I do have access to a Mac for cross-compilation if necessary, but I would rather use only my iPad because I like the challenge.

I can attach any logs necessary or provide more information if that would be useful; I do not know enough about this issue to know what information is useful. Thanks in advance for helping me!


Solution

  • For anyone who needs to work around this on a jailbreak that does not include a fix, I have written a userland hook (pasted below) based on the posix_spawn implementation in the source of Apple’s xnu kernel.

    Compile it with Theos, then inject it into all processes spawned by your shell by setting the environment variable DYLD_INSERT_LIBRARIES to the path of the resulting dylib. Note: some tweak injectors (namely libhooker) reset DYLD_INSERT_LIBRARIES, so if you notice this behavior, be sure to inject only your library.

    Because the implementation of the exec syscalls in iOS calls out to posix_spawn, this hook fixes all of the exec-related issues I’ve run into so far.

    #include <stdio.h>
    #include <stdbool.h> // for `true` in the argument-splitting loop
    #include <string.h>
    #include <errno.h>
    #include <spawn.h>
    
    // Copied from bsd/kern/kern_exec.c
    #define IS_WHITESPACE(ch) ((ch == ' ') || (ch == '\t'))
    #define IS_EOL(ch) ((ch == '#') || (ch == '\n'))
    // Copied from bsd/sys/imgact.h
    #define IMG_SHSIZE 512
    
    // Here, we provide an alternate implementation of posix_spawn which correctly handles #!.
    // This is based on the implementation of posix_spawn in bsd/kern/kern_exec.c from Apple's xnu source.
    // Thus, I am fairly confident that this posix_spawn has correct behavior relative to macOS.
    %hookf(int, posix_spawn, pid_t *pid, const char *orig_path, const posix_spawn_file_actions_t *file_actions, const posix_spawnattr_t *attrp, char *const orig_argv[], char *const envp[]) {
        
        // Call orig before checking for anything.
        // This mirrors the standard implementation of posix_spawn because it first checks if we are spawning a binary.
        int err = %orig;
        
        // %orig returns EPERM when spawning a script.
        // Thus, if err is anything other than EPERM, we can just return like normal.
        if (err != EPERM) 
            return err;
        
        // At this point, we do not need to check for exec permissions or anything like that.
        //  because posix_spawn would have returned that error instead of EPERM.
    
        // Now we open the file for reading so that we can check if it's a script.
        // If it turns out not to be a script, the EPERM must be from something else
        //  so we just return err.
    
        FILE *file = fopen(orig_path, "r");
        if (file == NULL) {
            return err;
        }
        if (fseek(file, 0, SEEK_SET)) {
            // Close the file before bailing out so the descriptor is not leaked.
            fclose(file);
            return err;
        }
    
        // In exec_activate_image, the data buffer is filled with the first PAGE_SIZE bytes of the file.
        // However, in exec_shell_imgact, only the first IMG_SHSIZE bytes are used.
        // Thus, we read IMG_SHSIZE bytes out of our file.
        // The buffer is filled with newlines so that if the file is shorter than IMG_SHSIZE bytes,
        //  the logic reads an IS_EOL.
        // (An initializer of {'\n'} would set only the first byte and zero the rest, so we use memset.)
        char vdata[IMG_SHSIZE];
        memset(vdata, '\n', sizeof(vdata));
        if (fread(vdata, 1, IMG_SHSIZE, file) < 2) { // If we couldn't read at least two bytes, it's not a script.
            fclose(file);
            return err;
        }
    
        // Now that we've filled the buffer, we don't need the file anymore.
        fclose(file);
        
        // Now we follow exec_shell_imgact.
        // The point of this is to confirm we have a script 
        //  and extract the usable part of the interpreter+arg string.
        // Where they return -1, we don't have a shell script, so we return err.
        // Where they return an error, we return that same error.
        // We don't bother doing any SUID stuff because SUID scripts should be disabled anyway.
            char *ihp;
            char *line_startp, *line_endp;
            
            // Make sure we have a shell script.
            if (vdata[0] != '#' || vdata[1] != '!') {
                    return err;
            }
            
            // Try to find the first non-whitespace character
            for (ihp = &vdata[2]; ihp < &vdata[IMG_SHSIZE]; ihp++) {
                    if (IS_EOL(*ihp)) {
                            // Did not find interpreter, "#!\n"
                            return ENOEXEC;
                    } else if (IS_WHITESPACE(*ihp)) {
                            // Whitespace, like "#!    /bin/sh\n", keep going.
                    } else {
                            // Found start of interpreter
                            break;
                    }
            }
            
            if (ihp == &vdata[IMG_SHSIZE]) {
                    // All whitespace, like "#!           "
                    return ENOEXEC;
            }
    
            line_startp = ihp;
            
            // Try to find the end of the interpreter+args string
            for (; ihp < &vdata[IMG_SHSIZE]; ihp++) {
                    if (IS_EOL(*ihp)) {
                            // Got it
                            break;
                    } else {
                            // Still part of interpreter or args
                    }
            }
            
            if (ihp == &vdata[IMG_SHSIZE]) {
                    // A long line, like "#! blah blah blah" without end
                    return ENOEXEC;
            }
            
            // Backtrack until we find the last non-whitespace
            while (IS_EOL(*ihp) || IS_WHITESPACE(*ihp)) {
                    ihp--;
            }
            
            // The character after the last non-whitespace is our logical end of line
            line_endp = ihp + 1;
            
            /*
             * Now we have pointers to the usable part of:
             *
             * "#!  /usr/bin/int first    second   third    \n"
             *      ^ line_startp                       ^ line_endp
             */
    
        // Now, exec_shell_imgact copies the interpreter into another buffer and then null-terminates it.
        // Then, it copies the entire interpreter+args into another buffer and null-terminates it for later processing into argv.
        // This processing is done in exec_extract_strings, which goes through and null-terminates each argument.
        // We will just do this all at once since that's much easier.
        
        // Keep track of how many arguments we have.
        int i_argc = 0;
    
        ihp = line_startp;
        while (true) {
            // ihp is on the start of an argument.
            i_argc++;
            // Scan to the end of the argument.
            for (; ihp < line_endp; ihp++) {
                if (IS_WHITESPACE(*ihp)) {
                    // Found the end of the argument
                    break;
                } else {
                    // Keep going
                } 
            }
            // Null terminate the argument
            *ihp = '\0';
            // Scan to the beginning of the next argument.
            for (; ihp < line_endp; ihp++) {
                if (!IS_WHITESPACE(*ihp)) {
                    // Found the next argument
                    break;
                } else {
                    // Keep going
                }
            }
            if (ihp == line_endp) {
                // We've reached the end of the arg string
                break;
            }
            // If we are here, ihp is the start of an argument.
        }
        // Now line_startp is a bunch of null-terminated arguments possibly padded by whitespace.
        // i_argc is now the count of the interpreter arguments.
           
        // Our new argv should look like i_argv[0], i_argv[1], i_argv[2], ..., orig_path, orig_argv[1], orig_argv[2], ..., NULL
        //  where i_argv is the arguments to be extracted from line_startp;
        // To allocate our new argv, we need to know orig_argc.
        int orig_argc = 0;
        while (orig_argv[orig_argc] != NULL) {
            orig_argc++;
        }
        
        // We need space for i_argc + 1 + (orig_argc - 1) + 1 char*'s
        char *argv[i_argc + orig_argc + 1];
        
        // Copy i_argv into argv
        int i = 0;
        ihp = line_startp;
        for (; i < i_argc; i++) {
            // ihp is on the start of an argument
            argv[i] = ihp;
            // Scan to the next null-terminator
            for (; ihp < line_endp; ihp++) {
                if (*ihp == '\0') {
                    // Found it
                    break;
                } else {
                    // Keep going
                }
            }
            // Go to the next character
            ihp++;
            // Then scan to the next argument. 
            // There must be another argument because we already counted i_argc.
            for (; ihp < line_endp; ihp++) {
                if (!IS_WHITESPACE(*ihp)) {
                    // Found it
                    break;
                } else {
                    // Keep going
                }
            }
            // ihp is on the start of an argument.
        }
    
        // Then, copy orig_path into argv.
        // We need to make a copy of orig_path to avoid issues with const.
        char orig_path_copy[strlen(orig_path)+1];
        strcpy(orig_path_copy, orig_path);  
        argv[i] = orig_path_copy;
        i++;
    
        // Now, copy orig_argv[1...] into argv.
        for (int j = 1; j < orig_argc; i++, j++) {
            argv[i] = orig_argv[j];
        }
        // Finally, add the null.
        argv[i] = NULL;
        // Now, our argv is set up correctly. 
    
        // Now, we can call out to posix_spawn again.
        // The interpreter is in argv[0], so we use that for the path.
        return %orig(pid, argv[0], file_actions, attrp, argv, envp);
    }
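
The #! parsing in the hook can also be tried off-device. The following is a minimal sketch in plain C, not the hook itself: parse_shebang, SHEBANG_MAX, MAX_INTERP_ARGS and SHEBANG_DEMO_MAIN are names invented for this illustration, and the cap on the number of interpreter words is an assumption that the hook does not have. It applies the same rules as above: the line ends at '\n' or '#', words are split on spaces and tabs, and a line that does not end within the 512-byte window is rejected.

```c
#include <stdio.h>
#include <string.h>

#define SHEBANG_MAX 512    /* same value as IMG_SHSIZE in the hook */
#define MAX_INTERP_ARGS 16 /* hypothetical cap, for this sketch only */

/*
 * Hypothetical helper: split the "#!" line found in buf (len bytes) into
 * NUL-terminated words stored in out, and point args[] at those words.
 * Returns the number of words, 0 if buf is not a script, or -1 if the line
 * is malformed (no interpreter, no end of line within SHEBANG_MAX bytes,
 * or more than MAX_INTERP_ARGS words).
 */
static int parse_shebang(const char *buf, size_t len,
                         char out[SHEBANG_MAX + 1],
                         char *args[MAX_INTERP_ARGS])
{
    if (len > SHEBANG_MAX)
        len = SHEBANG_MAX;
    if (len < 2 || buf[0] != '#' || buf[1] != '!')
        return 0;

    /* The line ends at '\n' or '#', or at the end of a short file. */
    size_t end = 2;
    while (end < len && buf[end] != '\n' && buf[end] != '#')
        end++;
    if (end == len && len == SHEBANG_MAX)
        return -1; /* line runs past the window */

    memcpy(out, buf + 2, end - 2);
    out[end - 2] = '\0';

    int argc = 0;
    char *p = out;
    for (;;) {
        while (*p == ' ' || *p == '\t') /* skip whitespace before a word */
            p++;
        if (*p == '\0')
            break;
        if (argc == MAX_INTERP_ARGS)
            return -1;
        args[argc++] = p;
        while (*p != '\0' && *p != ' ' && *p != '\t') /* scan the word */
            p++;
        if (*p == '\0')
            break;
        *p++ = '\0'; /* terminate the word and move on */
    }
    return argc > 0 ? argc : -1;
}

#ifdef SHEBANG_DEMO_MAIN
/* Build with -DSHEBANG_DEMO_MAIN to print the words of a sample line. */
int main(void)
{
    const char *sample = "#!  /usr/bin/env  python3 -u \nprint(1)\n";
    char out[SHEBANG_MAX + 1];
    char *args[MAX_INTERP_ARGS];
    int n = parse_shebang(sample, strlen(sample), out, args);
    for (int i = 0; i < n; i++)
        printf("arg %d: %s\n", i, args[i]);
    return n > 0 ? 0 : 1;
}
#endif
```

Copying the line into a separate buffer before splitting keeps the input untouched, which makes the function easier to test than the in-place splitting used in the hook.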