I have two assembly programs, shown below.
file: a.asm
section .text
global _start
_start: mov eax, 4
mov ebx, 1
mov ecx, mesg
mov edx, 10
int 0x80
mov eax, 1
int 0x80
mesg db "KingKong",0xa
and another assembly code
file: b.asm
section .text
global _start
_start: jmp mesg
prgm: mov eax, 4
mov ebx, 1
pop ecx
mov edx, 10
int 0x80
mov eax,1
int 0x80
mesg: call prgm
db "KingKong",0xa
After taking the hex bytes of these two programs and putting them inside this C wrapper:
char *a = "\xb8\x04\x00\x00\x00\xbb\x01\x00\x00\x00\xb9\x7d\x80\x04\x08\xba\x0a\x00\x00\x00\xcd\x80\xb8\x01\x00\x00\x00\xcd\x80\x4b\x69\x6e\x67\x4b\x6f\x6e\x67\x0a";
char *b = "\xe9\x19\x00\x00\x00\xb8\x04\x00\x00\x00\xbb\x01\x00\x00\x00\x59\xba\x0a\x00\x00\x00\xcd\x80\xb8\x01\x00\x00\x00\xcd\x80\xe8\xe2\xff\xff\xff\x4b\x69\x6e\x67\x4b\x6f\x6e\x67\x0a";
int main()
{
(*(int (*)(void))a)();
}
The second assembly code (b) prints 'KingKong' as expected, but the first assembly code (a) prints garbage, like the one shown:
root@bt:~/Arena# ./a
�root@bt:~/Arena#
root@bt:~/Arena# ./b
KingKong
The output generated by a (the first one) is this � weird character, while the second one (b) prints KingKong as expected.
Could someone explain why the second assembly code works while the first doesn't?
EDIT:
From the answer I see that the first program hard-codes the address. But the second approach also uses labels, as in jmp mesg. Won't this instruction make the program very similar to the first? Aren't they both the same, since they both use labels to decide the location? All I know is that, to make code position independent, we need to use the esp or ebp registers with a relative addressing scheme. Won't the second program's jmp instruction make it just the same as the first one?
The address of mesg can vary depending on how your program is laid out in memory.
The following will hard-code a specific address and will not work reliably (or at all):
mov ecx, mesg
For reference, the first approach hard-codes the following address:
mov ecx, 0x804807d
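The hard-coded address can be read straight out of the bytes in the question. The following is a minimal C sketch, not part of the original programs: the names `code_a`, `read_le32` and `mov_ecx_imm32` are made up for illustration, and the array only copies the instruction bytes of the first program (the message bytes are left out).

```c
#include <assert.h>
#include <stdint.h>

/* Instruction bytes of the first program (a), copied from the question. */
const unsigned char code_a[] = {
    0xb8, 0x04, 0x00, 0x00, 0x00,   /* mov eax, 4      */
    0xbb, 0x01, 0x00, 0x00, 0x00,   /* mov ebx, 1      */
    0xb9, 0x7d, 0x80, 0x04, 0x08,   /* mov ecx, imm32  */
    0xba, 0x0a, 0x00, 0x00, 0x00,   /* mov edx, 10     */
    0xcd, 0x80,                     /* int 0x80        */
    0xb8, 0x01, 0x00, 0x00, 0x00,   /* mov eax, 1      */
    0xcd, 0x80                      /* int 0x80        */
};

/* Assemble a 32-bit value from four little-endian bytes. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Return the immediate operand of a `mov ecx, imm32` (opcode 0xb9). */
uint32_t mov_ecx_imm32(const unsigned char *insn)
{
    assert(insn[0] == 0xb9);
    return read_le32(insn + 1);
}
```

Here `mov_ecx_imm32(code_a + 10)` returns 0x0804807d. That value is baked into the bytes as an absolute address, so it stays the same no matter where the C wrapper ends up placing the string in memory.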
The second approach does work because it figures out the address of mesg at runtime, using the return address of a call instruction.
Put another way, the first version only works if loaded at a specific address whereas the second is position-independent.
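To make the difference concrete, here is a toy model in C of what ends up in ecx in each version. It is only a sketch under stated assumptions: the function names and the `load_base` parameter are hypothetical, and the offsets (30 for the call, 35 for the message) are counted from the byte string of the second program in the question.

```c
#include <assert.h>
#include <stdint.h>

enum {
    CALL_OFFSET = 30,  /* offset of `call prgm` (label mesg) in b's bytes */
    CALL_LENGTH = 5,   /* e8 opcode plus a 4-byte displacement            */
    MESG_OFFSET = 35   /* offset of the "KingKong\n" bytes in b           */
};

/* Version b: `call` pushes the address of the byte right after itself,
   and `pop ecx` picks that address up. It moves with the load address. */
uint32_t popped_ecx(uint32_t load_base)
{
    assert(CALL_OFFSET + CALL_LENGTH == MESG_OFFSET);
    return load_base + CALL_OFFSET + CALL_LENGTH;
}

/* Version a: the immediate is fixed when the code is assembled and
   linked, so the load address has no influence on it. */
uint32_t hardcoded_ecx(uint32_t load_base)
{
    (void)load_base;
    return 0x0804807dU;
}
```

For any `load_base`, `popped_ecx(load_base) - load_base` is 35, which is exactly where the message bytes sit, while `hardcoded_ecx` keeps returning 0x0804807d even when the bytes live somewhere else entirely.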
It is worth noting that the jmp and the call instructions that appear in the second version use relative addressing, meaning that the opcodes specify the distance to the target rather than the address of the target. This makes these instructions work regardless of where they are placed in memory.
If you examine the opcodes, you'll see that the jmp is encoded as
e9 19 00 00 00
(i.e. jump 0x19, or 25 in decimal, bytes forward from the end of the instruction), and the call is encoded as
e8 e2 ff ff ff
where 0xffffffe2 is a small negative number (-30), again measured from the end of the instruction.
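The displacement arithmetic can be checked with a short C sketch that decodes the two instructions from the bytes in the question. The names `code_b`, `read_rel32` and `rel32_target` are illustrative only, and the helper assumes a 5-byte jmp/call with a 32-bit displacement (opcodes e9 and e8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of the second program (b), copied from the question. */
const unsigned char code_b[] = {
    0xe9, 0x19, 0x00, 0x00, 0x00,   /* offset  0: jmp mesg    */
    0xb8, 0x04, 0x00, 0x00, 0x00,   /* offset  5: mov eax, 4  */
    0xbb, 0x01, 0x00, 0x00, 0x00,   /* offset 10: mov ebx, 1  */
    0x59,                           /* offset 15: pop ecx     */
    0xba, 0x0a, 0x00, 0x00, 0x00,   /* offset 16: mov edx, 10 */
    0xcd, 0x80,                     /* offset 21: int 0x80    */
    0xb8, 0x01, 0x00, 0x00, 0x00,   /* offset 23: mov eax, 1  */
    0xcd, 0x80,                     /* offset 28: int 0x80    */
    0xe8, 0xe2, 0xff, 0xff, 0xff,   /* offset 30: call prgm   */
    0x4b, 0x69, 0x6e, 0x67, 0x4b, 0x6f, 0x6e, 0x67, 0x0a  /* "KingKong\n" */
};

/* Read a signed 32-bit little-endian displacement. */
int32_t read_rel32(const unsigned char *p)
{
    uint32_t v = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);
    int64_t wide = v;
    if (wide >= INT64_C(0x80000000))
        wide -= INT64_C(0x100000000);   /* two's-complement sign fix-up */
    return (int32_t)wide;
}

/* Offset that a jmp/call rel32 at `offset` transfers control to.
   The displacement is counted from the end of the 5-byte instruction. */
long rel32_target(const unsigned char *code, size_t offset)
{
    assert(code[offset] == 0xe9 || code[offset] == 0xe8);
    return (long)offset + 5 + read_rel32(code + offset + 1);
}
```

With these bytes, `rel32_target(code_b, 0)` is 30 (the call at mesg) and `rel32_target(code_b, 30)` is 5 (prgm). Both results are offsets within the byte string, so they hold wherever the string is loaded.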