I'm currently learning C and also some data structures such as binary search trees. I have trouble understanding how exactly changing pointer values within a function works in some cases and doesn't in others. I'll attach some code I wrote. It's an insert function that inserts values in the correct places in the BST (it works as it should). I tried working with pointers to pointers to be able to change values within a function. Even though it works, I'm still really confused about why it does. I don't quite understand why my insert function actually changes the BST even though I only work with local variables (tmp, parent_ptr) in it and I don't really dereference any pointers apart from "tmp = *p2r".
Thanks for helping out.
#include <stdio.h>
#include <stdlib.h>

struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

struct TreeNode** createTree(){
    struct TreeNode** p2r;
    p2r = malloc(sizeof(struct TreeNode*));
    *p2r = NULL;
    return p2r;
}

void insert(struct TreeNode** p2r, int val){
    // create TreeNode which we will insert
    struct TreeNode* new_node = malloc(sizeof(struct TreeNode));
    new_node -> val = val;
    new_node -> left = NULL;
    new_node -> right = NULL;
    // define a pointer that trails one step behind tmp
    struct TreeNode* parent_ptr = NULL;
    struct TreeNode* tmp = NULL;
    tmp = *p2r;
    // find right place to insert node
    while (tmp != NULL){
        parent_ptr = tmp;
        if (tmp -> val < val) tmp = tmp->right;
        else tmp = tmp->left;
    }
    if (parent_ptr == NULL){
        *p2r = new_node;
    }
    else if (parent_ptr->val < val){ //then insert on the right
        parent_ptr -> right = new_node;
    }else{
        parent_ptr -> left = new_node;
    }
}

int main(){
    struct TreeNode **p2r = createTree();
    insert(p2r, 4);
    insert(p2r, 2);
    insert(p2r, 3);
    return 0;
}
Let's analyze the approach step by step.
First, consider the following simple program.
#include <stdio.h>
#include <stdlib.h>

struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

void create( struct TreeNode *head, int val )
{
    head = malloc( sizeof( struct TreeNode ) );
    head->val = val;
    head->left = NULL;
    head->right = NULL;
}

int main(void)
{
    struct TreeNode *head = NULL;
    printf( "Before calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
    create( head, 10 );
    printf( "After calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
    return 0;
}
The program output is
Before calling the function create head == NULL is true
After calling the function create head == NULL is true
As you can see, the pointer head in main was not changed. The reason is that the function deals with a copy of the value of the original pointer head, so changing the copy does not affect the original pointer.
If you rename the function parameter to head_parm (to distinguish it from the original pointer named head), then you can imagine the function definition and its call the following way
create( head, 10 );
//...
void create( /*struct TreeNode *head_parm, int val */ )
{
    struct TreeNode *head_parm = head;
    int val = 10;
    head_parm = malloc( sizeof( struct TreeNode ) );
    //...
That is, within the function a local variable head_parm is created and initialized with the value of the argument head, and it is this local variable head_parm that is changed within the function.
It means that function arguments are passed by value.
To change the original pointer head declared in main, you need to pass it by reference.
In C, passing by reference is implemented by passing an object indirectly through a pointer to it. Dereferencing that pointer in the function then gives you direct access to the original object.
So let's rewrite the above program the following way.
#include <stdio.h>
#include <stdlib.h>

struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

void create( struct TreeNode **head, int val )
{
    *head = malloc( sizeof( struct TreeNode ) );
    ( *head )->val = val;
    ( *head )->left = NULL;
    ( *head )->right = NULL;
}

int main(void)
{
    struct TreeNode *head = NULL;
    printf( "Before calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
    create( &head, 10 );
    printf( "After calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
    return 0;
}
Now the program output is
Before calling the function create head == NULL is true
After calling the function create head == NULL is false
In the program in your question you did not declare the pointer to the head node as an ordinary variable, the way the program calling create( &head, 10 ) did
struct TreeNode *head = NULL;
Instead you allocated this pointer dynamically. In effect, your program does the following
#include <stdio.h>
#include <stdlib.h>

struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

void create( struct TreeNode **head, int val )
{
    *head = malloc( sizeof( struct TreeNode ) );
    ( *head )->val = val;
    ( *head )->left = NULL;
    ( *head )->right = NULL;
}

int main(void)
{
    struct TreeNode **p2r = malloc( sizeof( struct TreeNode * ) );
    *p2r = NULL;
    printf( "Before calling the function create *p2r == NULL is %s\n",
            *p2r == NULL ? "true" : "false" );
    create( p2r, 10 );
    printf( "After calling the function create *p2r == NULL is %s\n",
            *p2r == NULL ? "true" : "false" );
    return 0;
}
The program output is
Before calling the function create *p2r == NULL is true
After calling the function create *p2r == NULL is false
That is, compared with the previous program, where you called the function create with the expression &head of the type struct TreeNode **, you have now introduced an intermediate variable p2r that holds the address of a dynamically allocated pointer to the head node (it plays the role of &head), due to this code snippet
struct TreeNode **p2r = malloc( sizeof( struct TreeNode * ) );
*p2r = NULL;
That is, earlier you called the function create like
create( &head, 10 );
Now you are in effect calling the function like
struct TreeNode **p2r = &head; // where head was allocated dynamically
create( p2r, 10 );
The same takes place in your program. That is, within the function insert, dereferencing the pointer p2r gives you direct access to the pointer to the head node
if (parent_ptr == NULL){
    *p2r = new_node;
    ^^^^
}
As a result, the function changes the pointer to the head node, which was passed by reference through the pointer p2r.
The data members left and right of the other nodes are likewise changed by reference, through the pointer parent_ptr. The variable parent_ptr itself is local, but the -> operator dereferences it, so the assignment writes into the node it points to
else if (parent_ptr->val < val){ //then insert on the right
    parent_ptr -> right = new_node;
    ^^^^^^^^^^^^^^^^^^^
}else{
    parent_ptr -> left = new_node;
    ^^^^^^^^^^^^^^^^^^
}