I was implementing some data structures for fun in C, and I was comparing the speed to other data structures I implemented. This one is a closed addressing hash table.
This is the code I used to create a node
hashNode *createNewNode(int data) {
hashNode *node = calloc(1, sizeof(hashNode));
node->data.key = data;
node->isSet = true;
node->next = NULL;
}
This is the function I wanted to time.
for (int i = 0; i < 5; i++) {
hashNode *node = createNewNode(arr[i]);
InsertNode(node, map);
}
(arr is just the first 5000 numbers shuffled) As you may have noticed the function to create a node doesn't have a return value, still despite this the node is initialized correctly and all the numbers that were supposed to be in the table were inserted, and only those. How could that happen?
This gimmik only works in VS Code, I tried running it in Visual Studio but it (correctly) doesn't initialize the node. Does anyone know what is happening?
Edit: okay, I probably didn't express myself correctly I'm sorry. My question is, how does it work? I know it's undefined behaviour, but it doesn't look like something that is supposed to work, yet it worked correctly 5000 times out of 5000 and even if I put some printf here and there it keeps working correctly
As @dbush's answer already explains, not returning a value in a non-void function but using the value of the function call is UB (undefined behavior).
From this C23 standard draft §6.9.2, point 13:
Unless otherwise specified, if the } that terminates the function body is reached, and the value of the function call is used by the caller, the behavior is undefined.
That calling your createNewNode()
function and using its return value (if you can call it that since it is not returning) worked out as if you had returned the allocated node is pure luck (or unluckiness, depending on how you see it).
When you encounter undefined behavior, you can't really rely on it or reason about it. If you change the compiler or just its version or some compiler flags like optimization level or anything in your code, even if it doesn't directly relate to the source of the UB, your code might behave differently each time.
Even though you probably shouldn't, I still tried to reason about what was going on in your code and @dbush already somewhat explained that, too.
the value of node could be sitting in a register
Using Compiler Explorer I compiled some simplified but similar code using x86-64 gcc 14.1 (live example) without setting any compiler flags.
#include <stdio.h>
#include <stdlib.h>
int* gPtr = NULL;
int* allocateInt() {
int* p = calloc(1, sizeof *p);
gPtr = p;
}
int main() {
int* ptr = allocateInt();
printf("ptr: %p\n", ptr);
printf("gPtr: %p\n", gPtr);
free(ptr);
return 0;
}
This is its generated assembly:
gPtr:
.zero 8
allocateInt:
push rbp
mov rbp, rsp
sub rsp, 16
mov esi, 4
mov edi, 1
call calloc
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov QWORD PTR gPtr[rip], rax
nop
leave
ret
.LC0:
.string "ptr: %p\n"
.LC1:
.string "gPtr: %p\n"
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov eax, 0
call allocateInt
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov rsi, rax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov rax, QWORD PTR gPtr[rip]
mov rsi, rax
mov edi, OFFSET FLAT:.LC1
mov eax, 0
call printf
mov rax, QWORD PTR [rbp-8]
mov rdi, rax
call free
mov eax, 0
leave
ret
Some of this assembly code might look scary but the for us important parts are not that hard to understand if you know what you have to look for.
Program execution in C starts from the main()
function, in the assembly marked with the main:
label. After setting up the stack frame we encounter call allocateInt
and you can probably guess by yourself what this does. In allocateInt()
we have a call calloc
.
The return value of calloc()
is stored in the 64-bit register rax
. x86-64 calling conventions:
Integer return values up to 64 bits in size are stored in
RAX
We now have the following three lines:
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov QWORD PTR gPtr[rip], rax
The first one stores the value in the rax
register on the stack (in allocateInt()
's local p
). The next line stores the value in p
back into rax
. Then we store the value in rax
into our global gPtr
.
The next usage of rax
is directly after the call to allocateInt()
.
mov QWORD PTR [rbp-8], rax
You should already be able to tell what this line does. It stores the value in rax
on the stack. Since we are in main()
again and allocateInt()
's stack has already been teared down this now stores it in main()
's ptr
.
Therefore the program had the following output for me, which shows that allocateInt()
seemingly returned the correct value, even though we didn't write a return
statement ourselves:
ptr: 0x20d22a0
gPtr: 0x20d22a0
As I already said, modifying even a little bit can change the behavior of the program, since we have UB in our code. The following modifications I did show that you can't rely on UB.
When I turned up the optimization level to -O2
or -O3
, the output differed. However -O1
and -Os
still managed to return the pointer.
Originally I had a call to printf()
in the allocateInt()
function, but this caused ptr
not having the same value as p
. Moving that printf()
call into main()
and having a global gPtr
got me the result which I have shown here.
On some compilers my code caused a segfault.