I'm trying to come up with a slick way of generating a symbol table from my compiled binary.
I'm generally working in embedded with a fully featured GNU toolchain, though I am open to using system utilities (preferably Windows/MSYS2/Cygwin) to assist. My scripting language of choice is Python, as this is the language generally used at the company where I work.
For reference, the following post from ~4 years ago is almost exactly what I am looking for, and I was hoping that given a significant amount of time has passed, there has to be a simpler way to achieve this.
Extract detailed symbol information (struct members) from elf file compiled with ARM-GCC
I'm quite familiar with gdb and am used to using info variables, p &name, ptype name, etc. What I ultimately need is an input/output that looks something like below. I'll need to support all structs, unions, enums and deep nesting of types as well (structs within structs within structs). I'm ok with stripping off all other decorations like static, volatile, atomic, etc. I'm not sure yet what I want to do with pointers, but I suppose it'd be nice to append an asterisk to the type in the CSV output below.
Sample Code
uint64_t myU64;
int64_t my64;
typedef struct {
    uint8_t aaa;
    int8_t bbb;
} myStruct2_t;
struct {
    uint32_t a;
    int32_t b;
    float c;
    enum {
        E_ONE = 100,
        E_TWO = 200,
        E_THREE = 300
    } myEnum;
    union {
        uint16_t aa;
        int16_t bb;
    } myUnion;
    myStruct2_t myStruct2[3];
    uint32_t myArr[2];
} myStruct;
Desired Output
myU64, 0x8001918, uint64_t
my64, 0x800191C, int64_t
myStruct.a, 0x8001920, uint32_t
myStruct.b, 0x8001924, int32_t
myStruct.c, 0x8001928, float
myStruct.myEnum, 0x800192C, int16_t <-- Requires deeper digging for enum
myStruct.myUnion.aa, 0x800192E, uint16_t
myStruct.myUnion.bb, 0x800192E, int16_t
myStruct.myStruct2[0].aaa, 0x8001930, uint8_t
myStruct.myStruct2[0].bbb, 0x8001931, int8_t
myStruct.myStruct2[1].aaa, 0x8001932, uint8_t
myStruct.myStruct2[1].bbb, 0x8001933, int8_t
myStruct.myStruct2[2].aaa, 0x8001934, uint8_t
myStruct.myStruct2[2].bbb, 0x8001935, int8_t
myStruct.myArr[0], 0x8001938, uint32_t
myStruct.myArr[1], 0x800193C, uint32_t
Using the gdb command examples I listed above, I can get all this information, but it would require me to write an extremely sophisticated string parser. Any ideas? Tools that exist or an easy way to automate this? I'm ok with having to create a tool, but so far my ideas require a string parsing monstrosity. I've looked briefly into the python/gdb API, but haven't seen examples that are very applicable, but maybe that is a route I could take too.
Also, while my focus has been to use gdb, I'm open to any other tool that can assist.
Thanks!
"slick way of generating a symbol table from my compiled binary."
Your compiled binary already has a symbol table, and what you are trying to generate has nothing to do with what is normally a symbol table, creating unnecessary confusion.
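To make the distinction concrete: the symbol table that is already in the binary can be listed with nm from the GNU toolchain, and it holds only top-level names and addresses, with no types and no struct members. Below is a minimal sketch that parses nm's default three-column output (address, type letter, name). The sample text is hand-written for illustration using names and addresses from the question; it is not real tool output, and parse_nm and Symbol are hypothetical names.

```python
from typing import List, NamedTuple


class Symbol(NamedTuple):
    name: str
    address: int
    kind: str  # nm type letter, e.g. 'B' for .bss, 'D' for .data, 'T' for .text


def parse_nm(text: str) -> List[Symbol]:
    """Parse the default output of nm: '<hex address> <type letter> <name>'."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            # Undefined symbols have no address column; skip them.
            continue
        addr, kind, name = parts
        symbols.append(Symbol(name, int(addr, 16), kind))
    return symbols


# Hand-written sample (an assumption, not captured from a real binary).
SAMPLE_NM_OUTPUT = """\
08001918 B myU64
08001920 B myStruct
         U printf
"""

for sym in parse_nm(SAMPLE_NM_OUTPUT):
    print(f"{sym.name}, 0x{sym.address:X}, {sym.kind}")
```

Note that myStruct shows up as a single entry: nothing here says it contains a, b, c, and so on. That per-member information lives in the debug info, not in the symbol table.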
What you are looking for is a description of debug info in a non-standard format (the standard format is DWARF, which is what GDB reads to produce output from ptype).
To read DWARF debug info programmatically, use libdwarf.
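Whichever DWARF reader you end up using, the remaining work is a recursive walk over each variable's type that accumulates member offsets onto the variable's address. Here is a rough sketch of that walk in Python. The Base, Array and Record classes are a hypothetical, hand-built stand-in for what a DWARF reader would give you; nothing here calls libdwarf. The offsets and the 2-byte enum are assumptions chosen to match the addresses in the question.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union


@dataclass
class Base:
    """A scalar type (integers, floats, enums treated as their storage type)."""
    name: str
    size: int


@dataclass
class Array:
    elem: "Type"
    count: int


@dataclass
class Record:
    """A struct or union. Union members simply all have offset 0."""
    members: List[Tuple[str, int, "Type"]]  # (member name, byte offset, type)
    size: int


Type = Union[Base, Array, Record]


def size_of(t: Type) -> int:
    if isinstance(t, Array):
        return t.count * size_of(t.elem)
    return t.size


def flatten(name: str, addr: int, t: Type) -> Iterator[Tuple[str, int, str]]:
    """Yield (dotted name, address, type name) for every scalar leaf."""
    if isinstance(t, Base):
        yield name, addr, t.name
    elif isinstance(t, Array):
        stride = size_of(t.elem)
        for i in range(t.count):
            yield from flatten(f"{name}[{i}]", addr + i * stride, t.elem)
    else:
        for member, offset, member_type in t.members:
            yield from flatten(f"{name}.{member}", addr + offset, member_type)


# Hand-written description of myStruct from the question (assumed layout).
U8, I8 = Base("uint8_t", 1), Base("int8_t", 1)
U16, I16 = Base("uint16_t", 2), Base("int16_t", 2)
U32, I32 = Base("uint32_t", 4), Base("int32_t", 4)
FLOAT = Base("float", 4)

MY_STRUCT2 = Record([("aaa", 0, U8), ("bbb", 1, I8)], size=2)
MY_UNION = Record([("aa", 0, U16), ("bb", 0, I16)], size=2)
MY_STRUCT = Record(
    [
        ("a", 0, U32),
        ("b", 4, I32),
        ("c", 8, FLOAT),
        ("myEnum", 12, I16),  # enum reported as its 2-byte storage type
        ("myUnion", 14, MY_UNION),
        ("myStruct2", 16, Array(MY_STRUCT2, 3)),
        ("myArr", 24, Array(U32, 2)),
    ],
    size=32,
)

for row_name, row_addr, row_type in flatten("myStruct", 0x8001920, MY_STRUCT):
    print(f"{row_name}, 0x{row_addr:X}, {row_type}")
```

In a real tool the Record/Array/Base objects would be built from the DWARF entries for structure, union, array and base types, with typedefs and qualifiers such as volatile stripped along the way. Pointers could be handled as another leaf case that appends an asterisk to the pointed-to type name.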