
Removing DWARF-2 duplicate symbols exhausts memory


I'm dealing with a large project that gets compiled as a shared object. Compiling it with DWARF-2 debug symbols (-g -feliminate-unused-debug-types) results in a .debug_info section of around 700 MB.

If I add -feliminate-dwarf2-dups, the linker dies:

  error adding symbols: Memory exhausted
  ld returned 1 exit status

This is on a system with 4G RAM. Since this needs to be compiled on a wide range of systems, consuming over 4G RAM is not acceptable. I tried passing --no-keep-memory to ld, but it still fails.

The ld documentation describes --no-keep-memory as follows:

  ld normally optimizes for speed over memory usage by caching the symbol tables of input files in memory. This option tells ld to instead optimize for memory usage, by rereading the symbol tables as necessary. This may be required if ld runs out of memory space while linking a large executable.

I'm guessing ld loads all the symbols in memory then goes about finding dupes, which takes 5+ times the memory it takes to store them on disk.

Is there a simple way to incrementally do this? Like:

  1. Load the symbols from first .o file
  2. Load the symbols from next .o file
  3. Merge them, removing any duplicates
  4. goto 2.

I could link files two by two into temporary archives, then link those two by two, etc, but I really don't want to change the build process for this project. Maybe I could use objcopy to remove those segments, perform the dupe elimination separately, then insert the debug sections into the final ELF?

Is there any other tool that can perform these DWARF merges? dwarfdump only reads the files. Alternatively, can I invoke gcc/ld to just do this, instead of actually linking the files?


Solution

  • There are two ways that I know of to reduce DWARF size. Which one you want depends on your intended purpose.

    Fedora (and maybe other distros, I don't know) uses the dwz tool to compress DWARF. This works after the fact: you link your program or shared library, then run dwz. It is a "semantic" compressor, meaning it understands the DWARF and rewrites it into a smaller form. In DWARF terms it makes partial CUs and shares data that way. dwz also has a mode where it can compress data across different executables for more sharing.

    dwz yields the best compression. The major downside is that it doesn't fit very well into a developer workflow -- it is a bit slow, uses a lot of memory, etc. It's great for distros, though, and I think it would be appropriate for some other deployment situations.
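A usage sketch, with hypothetical file names (dwz rewrites the files in place by default):

```shell
# Compress the DWARF in one shared library after the final link.
dwz libbig.so

# Multifile mode: move data shared between several binaries into a
# common file, which each binary references via .gnu_debugaltlink.
dwz -m libs-common.dwz libbig.so libother.so
```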

    The other decent way to compress debuginfo is to use the -fdebug-types-section flag to gcc. This changes the DWARF output to put large types into their own sections. The types are hashed by their contents; then the linker merges these sections automatically.

    This approach yields decent compression, because types make up a substantial part of the DWARF, and decent performance, because merging identical sections in the linker is cheap. The major downside is that the compression isn't as thorough as dwz's.

    gdb understands both of these kinds of compression. Support in other tools is a bit more spotty.