I wrote the below program to iterate over all images in memory and dump their string tables.
#include <mach-o/dyld.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char** argv) {
uint32_t count = _dyld_image_count();
for (uint32_t i = 0 ; i < count ; i++) {
const char* imageName = _dyld_get_image_name(i);
printf("IMAGE[%u]=%s\n", i, imageName);
const struct mach_header* header = _dyld_get_image_header(i);
if (header->magic != MH_MAGIC_64)
continue;
struct mach_header_64* header64 = (struct mach_header_64*)header;
char *ptr = ((void*)header64) + sizeof(struct mach_header_64);
for (uint32_t j = 0; j < header64->ncmds; j++) {
struct load_command *lc = (struct load_command *)ptr;
ptr += lc->cmdsize;
if (lc->cmd != LC_SYMTAB)
continue;
struct symtab_command* symtab = (struct symtab_command*)lc;
printf("\t\tLC_SYMTAB.stroff=%u\n", symtab->stroff);
printf("\t\tLC_SYMTAB.strsize=%u\n", symtab->strsize);
if (symtab->strsize > 100*1024*1024) {
printf("\t\tHUH? Don't believe string table is over 100MiB in size!\n");
continue;
}
char *strtab = (((void*)header64) + symtab->stroff);
uint32_t off = 0;
while (off < symtab->strsize) {
char *e = &(strtab[off]);
if (e[0] != 0)
printf("\t\tSTR[%u]=\"%s\"\n", off, e);
off += strlen(e) + 1;
}
}
}
return 0;
}
It seems to randomly work for some images, but for others the stroff/strsize have nonsensical values:
LC_SYMTAB.stroff=1266154560
LC_SYMTAB.strsize=143767728
It seems to always be the same two magic values, but I'm not sure if this is system-dependent in some way or if other people will get the same specific values.
If I comment out the check for strsize being over 100MiB, then printing the string table segfaults.
Most images seem to have this problem, but some don't. When I run it, I get the issue for 29 images out of 38.
I can't observe any pattern as to which do and which won't. What is going on here?
If it is relevant, I am testing on macOS 10.14.6 and compiling with Apple LLVM version 10.0.1 (clang-1001.0.46.4).
As you already worked out, those are from the dyld_shared_cache
. And the 0x80000000
flag is indeed documented, in the headers shipped with Xcode or any semi-recent XNU source:
#define MH_DYLIB_IN_CACHE 0x80000000 /* Only for use on dylibs. When this bit
is set, the dylib is part of the dyld
shared cache, rather than loose in
the filesystem. */
As you've also discovered, the stroff
/strsize
values do not yield usable results when added to the dyld_shared_cache
base. That is because those are not memory offsets, but file offsets. This is true for all Mach-O's, it's just often the case that the segments of non-cached binaries have the same relative position in file and memory offsets. But this is definitely not true for the shared cache.
To translate the file offset into a memory address, you'll have to parse the segments in the shared cache header. You can find struct definitions in the dyld source.