Search code examples
c#searchmemorydata-structurescomputer-science

How is the Root Node being updated when child nodes are the ones updated when performing Tree modifications?


How does a Tree's parent node get updated with the child updates we perform? Specifically like when you do a BFS or DFS search, perform a check at the node, and then update that node. What causes in memory or in the programming language to know "Oh hey, I have to make this update to the Root node as well!"

In my example I'm using a Trie (not really important but I will just refer to it as a Tree). I have this BFS search below that searches through a bunch of nodes and will update with a specific word value. The comment I have is where my question lies. That child node is currently stored in "deque." How does the program know I mean to update the value within Root, not just the value pass into the variable "deque?"

To me, what I think SHOULD be happening is Root shouldn't be updated and the only thing that gets updated is the variable "deque," and then after it's done, everything gets garbage collected and Root remains the same. Instead, Root gets updated when "deque" gets updated. Perhaps I missed this in my Data Structures course, but it has been bugging me for a while now and I've been having trouble finding resources that explain this.

    private static void BFS_UpdateAllWords(Node Root, string testword, string updatevalue)
    {
        Queue<Node> bfs_queue = new Queue<Node>();
        bfs_queue.Enqueue(Root);

        while (bfs_queue.Count > 0)
        {
            var deque = bfs_queue.Dequeue();

            foreach (string childKey in deque.Children.Keys)
            { // Update all child nodes at the key
                if (deque.Children[childKey].Word.Equals(testword))
                {
                    // This part right here for any time of Tree traversal
                    deque.Children[childKey].WordType = updatevalue;
                }
                bfs_queue.Enqueue(deque.Children[childKey]);
            }
        }
    }

Solution

  • I think you are encountering what is roughly a pass by value vs pass by reference issue. When a function is pass by value, the parameter of that function is copied such that the function gets its own distinct version and the caller won't see changes the function makes to the parameter. However, if a function is pass by reference, no copy is made and if the function makes changes to a parameter the caller will see the changes that were made.

    C# and Java are a little funny here because they are always pass by value (unless you explicitly tell C# to do otherwise), but they "pass references by value" too. That is, when you pass a function a reference type (like an object or a class), what the function receives is a copy of the reference to it (not a deep copy of the underlying object). This means the underlying object can still be changed inside a function, because that function has its own reference to it.

    A consequence of this is that when you pass the Enqueue method a reference type, as your Node objects are, what is stored in the queue are just (copies of) references to the same underlying objects which live outside the queue. That is, the elements of your queue are not deep copies of your tree's nodes: they are references to the same underlying Node objects. When you Deque() into the "deque" variable you're therefore just creating another alias which refers to some node which you might otherwise have been able to reach in a traversal directly from the Root node. This way, when you modify the properties of the "deque" node, you're directly modifying the properties of some Node object as it exists in the tree under the Root node—not a copy of that object with its own address in memory but the same underlying object.

    This is why when deque gets garbage collected the changes persist: deque just contained copies of references to the nodes which already existed in the Root tree. There's nothing your program had to figure out to know to make the changes you made via deque also affect nodes in the Root tree. Because deque contained references to those same underlying nodes, changes made through it were already directly modifying those nodes, and so after deque went out of scope the changes naturally persisted.