I have written the following code. Can you explain what the assembly is telling me here?
typedef struct
{
    int abcd[5];
} hh;
void main()
{
    printf("%d", ((hh*)0)+1);
}
Assembly:
.file "aa.c"
.section ".rodata"
.align 8
.LLC0:
.asciz "%d\n"
.section ".text"
.align 4
.global main
.type main, #function
.proc 020
main:
save %sp, -112, %sp
sethi %hi(.LLC0), %g1
or %g1, %lo(.LLC0), %o0
mov 20, %o1
call printf, 0
nop
return %i7+8
nop
.size main, .-main
.ident "GCC: (GNU) 4.2.1"
Oh wow, SPARC assembly language, I haven't seen that in years.
I guess we go line by line? I'm going to skip some of the uninteresting boilerplate.
.section ".rodata"
.align 8
.LLC0:
.asciz "%d\n"
This is the string constant you used in printf (so obvious, I know!). The important things to notice are that it's in the .rodata section (sections are divisions of the eventual executable image; this one is for "read-only data" and will in fact be immutable at runtime) and that it's been given the label .LLC0. Labels that begin with a dot are private to the object file. Later, the compiler will refer to that label when it wants to load the address of the string constant.
.section ".text"
.align 4
.global main
.type main, #function
.proc 020
main:
.text is the section for actual machine code. This is the boilerplate header for defining the global function named main, which at the assembly level is no different from any other function (in C -- not necessarily so in C++). I don't remember what .proc 020 does.
save %sp, -112, %sp
Save the previous register window and adjust the stack pointer downward. If you don't know what a register window is, you need to read the architecture manual: http://sparc.org/wp-content/uploads/2014/01/v8.pdf.gz. (V8 is the last 32-bit iteration of SPARC, V9 is the first 64-bit one. This appears to be 32-bit code.)
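If register windows are new to you, a toy model may help before you dive into the manual. The sketch below is only an illustration, not how any real SPARC is built: the names (toy_regfile, toy_save, toy_restore and friends) are made up, the window count of 8 is an assumption, and it ignores the global registers, window overflow traps, and the stack pointer adjustment that save also performs. What it does show is the key overlap: after a save, the caller's "out" registers are the callee's "in" registers, and each function gets fresh "local" registers.

```c
#include <stdint.h>

#define NWINDOWS 8   /* assumed for illustration; real CPUs vary */
#define SLOT 16      /* each slot holds 8 "out" + 8 "local" registers */

/* Hypothetical toy register file: one flat array plus a current window pointer. */
typedef struct {
    uint32_t phys[NWINDOWS * SLOT];
    int cwp;
} toy_regfile;

void toy_init(toy_regfile *rf) {
    for (int i = 0; i < NWINDOWS * SLOT; i++) rf->phys[i] = 0;
    rf->cwp = 0;
}

/* %oN and %lN live in the current window's own slot. */
uint32_t *out_reg(toy_regfile *rf, int n)   { return &rf->phys[rf->cwp * SLOT + n]; }
uint32_t *local_reg(toy_regfile *rf, int n) { return &rf->phys[rf->cwp * SLOT + 8 + n]; }

/* %iN is an alias for the caller's %oN: same physical storage. */
uint32_t *in_reg(toy_regfile *rf, int n) {
    int caller_slot = (rf->cwp + 1) % NWINDOWS;
    return &rf->phys[caller_slot * SLOT + n];
}

/* save moves to a fresh window; restore moves back to the caller's. */
void toy_save(toy_regfile *rf)    { rf->cwp = (rf->cwp + NWINDOWS - 1) % NWINDOWS; }
void toy_restore(toy_regfile *rf) { rf->cwp = (rf->cwp + 1) % NWINDOWS; }
```

In this model, if the caller stores 20 through out_reg(&rf, 1) and then toy_save runs, the callee reads the same 20 through in_reg(&rf, 1), while its local registers are untouched by anything the caller put in its own locals.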
sethi %hi(.LLC0), %g1
or %g1, %lo(.LLC0), %o0
This two-instruction sequence has the net effect of loading the address .LLC0 (that's your string constant) into register %o0, which is the first outgoing argument register. (The arguments to this function are in the incoming argument registers.)
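Why two instructions? Every SPARC instruction is 32 bits wide, so a full 32-bit address can't fit inside one as an immediate. sethi supplies the upper 22 bits and the or fills in the lower 10. Here is a small C model of that split; the function names are hypothetical, invented just for this sketch.

```c
#include <stdint.h>

/* %hi(x): the upper 22 bits of a 32-bit value, as placed in sethi's immediate field. */
uint32_t sparc_hi22(uint32_t x) { return x >> 10; }

/* %lo(x): the lower 10 bits, small enough to fit in or's immediate field. */
uint32_t sparc_lo10(uint32_t x) { return x & 0x3ffu; }

/* Model of "sethi imm22, reg": imm22 lands in the upper 22 bits, the low 10 are cleared. */
uint32_t model_sethi(uint32_t imm22) { return imm22 << 10; }

/* Model of the whole two-instruction sequence from the listing. */
uint32_t model_load_address(uint32_t addr) {
    uint32_t g1 = model_sethi(sparc_hi22(addr));  /* sethi %hi(addr), %g1   */
    uint32_t o0 = g1 | sparc_lo10(addr);          /* or %g1, %lo(addr), %o0 */
    return o0;
}
```

For any 32-bit addr, model_load_address(addr) gives back addr unchanged, which is the whole point of the pair.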
mov 20, %o1
Load the immediate constant 20 into %o1, the second outgoing argument register. This is the value computed by ((hh*)0)+1, which the compiler evaluated at compile time. It's 20 because your struct hh is 20 bytes long (five 4-byte ints) and you asked for the second one within the array starting at address zero.
Incidentally, computing an offset from a pointer is only well-defined in C when there is actually a sufficiently large array at the address of the base pointer; ((hh*)0) is a null pointer, so there isn't an array there, so the expression ((hh*)0)+1 technically has undefined behavior. GCC 4.2.1, targeting hosted SPARC, happens to have interpreted it as "pretend there is an arbitrarily large array of hh structs at address zero and compute the expected offset for array member 1", but other (especially newer) compilers may do something completely different.
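If what you wanted was the size of the struct, you can get the same number without the null pointer trick. The sketch below does the pointer arithmetic on a real two-element array, where it is well-defined; hh_stride_bytes is a name I made up for the example, and sizeof(hh) gives the same answer more directly.

```c
#include <stddef.h>

typedef struct {
    int abcd[5];
} hh;

/* Byte distance from element 0 to element 1 of a real array of hh. */
size_t hh_stride_bytes(void) {
    hh pair[2];
    /* pair + 1 is fine here: it points at an element that actually exists. */
    return (size_t)((char *)(pair + 1) - (char *)pair);
}
```

Print the result with printf("%zu\n", hh_stride_bytes()); on a platform with 4-byte int that should come out as 20. Note that the original code passes a pointer where %d expects an int, which is a mismatch in its own right.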
call printf, 0
nop
Call printf. I don't remember what the zero is for. The call instruction has a delay slot (again, read the architecture manual) which is filled in with a do-nothing instruction, nop.
return %i7+8
nop
Jump to the address in register %i7 plus eight. This has the effect of returning from the current function: %i7 holds the address of the call instruction that invoked main, and adding eight skips over that call and its delay slot. return also has a delay slot, which is filled in with another nop. You might expect a restore instruction in this delay slot, matching the save at the top of the function, so that main's caller gets its register window back. It isn't needed here: return is a SPARC V9 instruction that combines the jump with the register-window effect of restore. (A pure V8 function would end with ret followed by restore in the delay slot instead, so this code was evidently compiled for a V9-capable CPU even though it is 32-bit.) Separately, you declared main as void main(), which is not guaranteed to work with any C implementation unless its documentation specifically says so, and is always bad style.