I am trying to view the assembly for my simple C application. First, I produced assembly from the binary using objdump, which gave me a 4.3 MB file with 103,228 lines of assembly code. Then I tried passing the -S and -save-temps flags to gcc.
I have used the following three commands:
1. arm-linux-gnueabi-objdump -d hello_simple > hello_simple.dump
2. arm-linux-gnueabi-gcc -save-temps -static hello_simple.c -o hello_simple -lm
3. arm-linux-gnueabi-gcc -S -static hello_simple.c -o hello_simple.asm -lm
Commands 2 and 3 produce exactly the same result: 65 lines of assembly code. I understand that objdump also emits some extra details, but why is there such a huge difference?
EDIT1: I used the following command to build that binary:
arm-linux-gnueabi-gcc -static hello_simple.c -o hello_simple -lm
EDIT2: Though, -static
and -lm
flags may look here unnecessary but, I have to execute this binary on simulator after compile time additions of some assembly components, making them a must.
So, which assembly code should I consider as the most relevant during my analysis of execution traces? (I know it's another question but it would be handy to cover it in the same answer.)
The last two commands just save the asm for your own functions. The first one also includes the CRT startup code and, since you linked statically, every library function you called.
Note that for command 3, -static and -lm don't do anything, because you're not linking. gcc foo.c -S -O3 -fverbose-asm -o- | less is often handy.
I notice that none of your command lines included -O3 or a -march= option. You should compile with optimization on, and have gcc optimize your code for the target hardware.
.s is the standard suffix for machine-generated asm (.S is for hand-written asm: gcc foo.S will run it through cpp first). gcc -S produces a .s, the same way -c produces a .o.
For x86, .asm is usually only used for Intel-syntax assembly (NASM/YASM), but IDK what the conventions are for ARM.
So, which assembly code should I consider as the most relevant during my analysis of execution traces?
It depends on what you're trying to learn! If you have a good sense of how "expensive" each library function call is (in terms of instruction count, branches polluting the branch predictors, and data-cache pollution), then you don't need to trace execution through library calls. If math library functions are used in some of your inner loops, then it's worth looking at them if the code is time-critical.
Usually a profiler or single-stepping in a debugger is useful for that, though. Just having disassembly output of a lot of library code is usually just clutter.