Assume I have a compiled binary, without debug symbols, built from C source code similar to this:
    char code[] = "1234";
    if (atoi(code) == 4321)
    {
        puts("Right");
    }
    else
    {
        puts("Wrong");
    }
I debug the program with GDB.
Is there a way to find out the value of the integer defined as 4321 while I step through the program?
In the general case, no. For example, if (x < 10) might get compiled to cmp reg, 9 / jle for x86 (i.e. x <= 9). Of course that's the same logic, but you can't recover whether the source used 9 or 10 from the immediate operand in the machine code.
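To make the 9-versus-10 point concrete, here is a minimal sketch in C. The function names are made up for illustration. Both functions return the same result for every int, so a compiler is free to emit the same compare for either one, and the immediate you see in the disassembly need not match the constant in the source.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */

/* Source written with the constant 10. */
bool below_ten_lt(int x)
{
    return x < 10;
}

/* Source written with the constant 9: same logic for every int x. */
bool below_ten_le(int x)
{
    return x <= 9;
}
```

Compiling both with optimization and comparing the asm is the way to check what a particular compiler picks.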
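Or constant-propagation or value-range analysis might have made the condition known-true or known-false at compile time, so no immediate value appears in the asm at all. For example, a smart compiler that knows what atoi does could evaluate atoi("1234") == 4321 at compile time and keep only the branch that is actually taken, compiling this C the same as a bare puts call with no compare.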
In this specific case, it turns out GCC / clang don't bother looking for that optimization; atoi on compile-time-constant strings isn't something that normal programs do often, so it's not worth the compile-time to make strtol a built-in function that would support constant-propagation.
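For contrast, strlen is a function that GCC and clang do treat as a built-in, so with optimization enabled they can typically fold a call on a string literal down to a constant. The sketch below (function names are hypothetical) puts the two cases side by side; looking at the compiler output for each is the way to confirm what your compiler version actually does.

```c
#include <stdlib.h>
#include <string.h>

/* strlen on a literal: with optimization this typically folds to "return 1",
   so neither the 4 nor a call to strlen needs to appear in the asm. */
int length_is_four(void)
{
    return strlen("1234") == 4;
}

/* atoi on a constant string: per the answer, GCC / clang leave this as a
   runtime call followed by a compare against the immediate 4321. */
int code_is_4321(void)
{
    char code[] = "1234";
    return atoi(code) == 4321;
}
```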
In this case the number does appear fairly obviously as an immediate constant, when GCC or clang compile it for x86-64 GNU/Linux (Godbolt), with your code (including the array declaration) inside a function. Not as a global; that would make constant-propagation impossible.
This is the compiler's asm output; it hasn't had a round trip to machine code and back like you'd see inside gdb or other debuggers, but that would preserve everything except the symbol names.
foo: # @foo
push rax # reserve 8 bytes for the local array
mov byte ptr [rsp + 4], 0 # Silly compiler, could have made the next instruction a qword store of a sign-extended 32-bit immediate to get the 0-termination for free.
mov dword ptr [rsp], 875770417 # 0x34333231: the ASCII bytes of the "1234" array initializer (little-endian)
mov rdi, rsp
xor esi, esi
mov edx, 10
call strtol # atoi compiled to strtol(code, NULL, 10)
cmp eax, 4321 # compare retval with immediate constant
mov eax, offset .L.str
mov edi, offset .L.str.1
cmove rdi, rax # select which string literal to pass to puts, based on FLAGS from the compare
pop rax # clear up the stack
jmp puts # TAILCALL
.section .rodata # Godbolt filters directives, re-added this one.
.L.str:
.asciz "Right"
.L.str.1:
.asciz "Wrong"
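The 875770417 immediate in the listing above is worth decoding by hand, since string initializers often show up this way. A minimal sketch, with a made-up helper name: it lays out the four bytes of the immediate in the order a little-endian store (as on x86-64) would write them to memory.

```c
#include <stdint.h>

/* Hypothetical helper: write the 4 bytes of imm into out in the order a
   little-endian dword store places them in memory, then 0-terminate.
   out must have room for at least 5 bytes. */
void imm32_to_ascii(uint32_t imm, char out[5])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((imm >> (8 * i)) & 0xFF);  /* lowest byte goes first */
    }
    out[4] = '\0';
}
```

Calling it with 875770417 (0x34333231) yields the bytes 0x31 0x32 0x33 0x34, which is the string "1234" from the source.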
Note that RISC ISAs sometimes have to construct constants using 2 instructions, or (common on 32-bit ARM) load them from memory with a PC-relative load. So you won't always find a constant as a single number in the disassembly.
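As an illustration of the two-instruction case, assume a RISC-V target, where a 32-bit constant is commonly built with a lui / addi pair: lui places a 20-bit immediate in bits 31:12, and addi adds a sign-extended 12-bit immediate. The helper below is a hypothetical sketch for recombining the two immediates you would read off the disassembly; it is not taken from any tool.

```c
#include <stdint.h>

/* Hypothetical helper: recombine the immediates of a RISC-V
   "lui rd, hi20" / "addi rd, rd, lo12" pair into the 32-bit constant.
   lo12 is passed already sign-extended, as a disassembler prints it
   (range -2048..2047). */
uint32_t riscv_lui_addi(uint32_t hi20, int32_t lo12)
{
    /* Unsigned arithmetic wraps modulo 2^32, matching the 32-bit register. */
    return (hi20 << 12) + (uint32_t)lo12;
}
```

For 4321 (0x10E1), the pair would carry the immediates 1 and 225, since 4096 + 225 = 4321, so neither instruction contains 4321 on its own.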