gcc arm elf binutils

Extract read-only data sections from an archive/lib (ELF, I guess?) for compression


UPDATE: The question is as follows: my build setup generates an archive/lib (binary output) from which I would like to extract some data. In my case it is for compression, but that is not really the point.

My instinct tells me that, since a linker can extract constant data from an archive/lib, it should be possible (and easy) for me to dump the binary contents of a symbol contained in an archive/lib to, e.g., a file.

So that's my question: how do I dump the binary contents of a symbol in an archive/lib (in ELF format)?

UPDATE: I am building an application based on lvgl. To render text, I'm using the online tool provided by the lvgl maintainer to convert TrueType fonts into C code (const data), which is linked into the application. The resulting font data set is getting too large for my available flash memory, but I have a big chunk of unused RAM. So I would like to use heatshrink to compress the data and decompress it to RAM at runtime.

This requires that my build setup can extract the binary data, compress it, and link it into flash, so that my runtime code can decompress it.

I guessed that I could put all the generated font data into a lib, extract the binary data, compress it, and link it as a blob into the application.

But I'm failing to extract the data to compress from the library.

E.g. my font-data declaration looks as follows:

/* Store the image of the letters (glyph) */
static const uint8_t _glyph_bitmap[] =
{ /* const byte values follow (e.g. 0x00) */ };

static const lv_font_glyph_dsc_t _glyph_dsc[] =
{ /* struct initialization follows */ };

lv_font_t myfont =
{ /* struct initialization follows */ };

So I would need to access myfont, and the declarations it references, in binary form.

I have a tool which creates C code representing some binary data, so that the data can be compiled and linked into a final executable (ARM platform, GNU toolchain, custom hardware). I am running out of flash, but have RAM to spare, so I'm considering compressing some large constant-data sections in a library and decompressing them to RAM as needed. I can compile the C code and put the result into an archive, but so far I've had no luck extracting the binary contents of the constant data for compression using e.g. objdump or objcopy. Something tells me that this is possible (and maybe even easy). But how? I've searched for the problem, but came up empty-handed.


Solution

  • Eureka! I figured it out!

    While I fully acknowledge and appreciate the advice given in comments / other answers, it still bothered me, because I had speculated that it should be relatively easy to "play the linker" and extract blobs of hardcoded data from e.g. an object file.

    Well, it turns out to be relatively easy (for ELF objects, anyway), just as expected, by using readelf.

    To dump a symbol, I used two steps:

    1. Find the index of the section that holds the symbol (the Ndx column) by listing the symbols in the object:

      $ readelf --syms company_logo.o

      Symbol table '.symtab' contains 17 entries:
         Num:    Value  Size Type    Bind   Vis      Ndx Name
           0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
           1: 00000000     0 FILE    LOCAL  DEFAULT  ABS company_logo.c
           2: 00000000     0 SECTION LOCAL  DEFAULT    1
           3: 00000000     0 SECTION LOCAL  DEFAULT    2
           4: 00000000     0 SECTION LOCAL  DEFAULT    3
           5: 00000000     0 SECTION LOCAL  DEFAULT    4
           6: 00000000     0 NOTYPE  LOCAL  DEFAULT    4 $d
           7: 00000000     0 SECTION LOCAL  DEFAULT    6
           8: 00000000     0 SECTION LOCAL  DEFAULT    7
           9: 00000000     0 SECTION LOCAL  DEFAULT    9
          10: 00000000     0 SECTION LOCAL  DEFAULT   10
          11: 00000000     0 SECTION LOCAL  DEFAULT   12
          12: 00000000     0 SECTION LOCAL  DEFAULT   13
          13: 00000000     0 SECTION LOCAL  DEFAULT   14
          14: 00000000     0 SECTION LOCAL  DEFAULT   15
          15: 00000000    12 OBJECT  GLOBAL DEFAULT    4 company_logo
          16: 00000000 21879 OBJECT  GLOBAL DEFAULT    6 company_logo_map

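    2. Dump the contents of the section that holds the symbol.

    Now company_logo_map was my target. Its Ndx column shows that it lives in section 6 (its symbol number is 16, but readelf --hex-dump expects a section number or name), so I dumped that section as follows: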

      $ readelf --hex-dump=6 company_logo.o

      Hex dump of section '.rodata.company_logo_map':
        0x00000000 00000000 00000000 00000000 00000000 ................
        0x00000010 00000000 00000000 00000000 00000000 ................
        0x00000020 00000000 00000000 00000000 00000000 ................
        0x00000030 00000000 00000000 00000000 00000000 ................
        ... lots more data here
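
The hex dump is text, so it still has to be turned into raw bytes before it can be fed to a compressor. The following is a minimal Python sketch of that step, not something from my actual build setup. The function names (`find_symbol`, `hexdump_to_bytes`, `symbol_bytes`) are made up for illustration, and the parsing assumes the column layout that readelf printed in the two listings above: eight whitespace-separated fields per symbol line, and a fixed-width hex column of up to four 8-digit groups per dump line. In a relocatable object the Value column is the symbol's offset inside its section, which is what the slice at the end relies on.

```python
import re

# "  0x<address> " followed by readelf's hex column: 4 groups of 8 hex digits,
# space-padded on a short last line (assumed to be 36 characters wide).
_DUMP_LINE = re.compile(r"\s*0x[0-9a-fA-F]+ (.{1,36})")


def find_symbol(syms_text, name):
    """Return (section_index, offset, size) for `name`, or None.

    `syms_text` is the text printed by `readelf --syms`.
    """
    for line in syms_text.splitlines():
        fields = line.split()
        # Expected fields: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) == 8 and fields[0].endswith(":") and fields[7] == name:
            if not fields[6].isdigit():
                return None  # UND, ABS, ...: no section data in this object
            # Value is hex; Size is parsed with base 0 so both a decimal
            # size and a 0x-prefixed size are accepted.
            return int(fields[6]), int(fields[1], 16), int(fields[2], 0)
    return None


def hexdump_to_bytes(dump_text):
    """Convert the text printed by `readelf --hex-dump=N` to raw bytes."""
    out = bytearray()
    for line in dump_text.splitlines():
        m = _DUMP_LINE.match(line)
        if m:  # header and blank lines do not match and are skipped
            out += bytes.fromhex("".join(m.group(1).split()))
    return bytes(out)


def symbol_bytes(syms_text, dump_text, name):
    """Slice one symbol out of the dump of the section that holds it."""
    found = find_symbol(syms_text, name)
    if found is None:
        raise KeyError(name)
    _, offset, size = found
    section = hexdump_to_bytes(dump_text)
    return section[offset:offset + size]
```

To use it, capture the output of `readelf --syms company_logo.o` and of `readelf --hex-dump=6 company_logo.o` as text, pass both to `symbol_bytes(..., "company_logo_map")`, and write the result to a file for the compressor. For a library, the member objects would first have to be unpacked with `ar x`. It may also be worth trying `objcopy -O binary --only-section=.rodata.company_logo_map company_logo.o out.bin`, which should write the section as a raw file directly, although I have not verified that against this object.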