I am reading chapter 26, Garbage Collection, of "Crafting Interpreters" by Bob Nystrom. On the first page, he gives the following example:
var a = "first value";
a = "updated";
// GC here.
print a;
Say we run the GC after the assignment has completed on the second line. The string “first value” is still sitting in memory, but there is no way for the user’s program to ever get to it. Once a got reassigned, the program lost any reference to that string. We can safely free it. A value is reachable if there is some way for a user program to reference it.
My question is how "first value" can ever be collected, since it is a constant stored in the function object's chunk. As long as the function is alive, that constant stays alive too, even after the point where it is supposed to be dropped. This is because, when a function object is traced, the snippet he uses to mark it is the following:
switch (object->type) {
  case OBJ_FUNCTION: {
    ObjFunction* function = (ObjFunction*)object;
    markObject((Obj*)function->name);
    // Every constant in the function's chunk is marked along with it.
    markArray(&function->chunk.constants);
    break;
  }
  // ... other object types ...
}
So if a function is alive, its constants are alive as well, and in our example "first value" would stay alive. How does the claim he makes at the start fit into this framework?
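To make the concern concrete, here is a minimal sketch of the marking behaviour described above. It is not the book's code: `ToyString`, `ToyFunction`, `toyMarkFunction`, and `toyConstantSurvives` are hypothetical names, and the constant table is reduced to a plain array of pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object model, loosely inspired by the book's Obj/ObjFunction.
   All names are illustrative assumptions, not the book's API. */
typedef struct {
  bool isMarked;
  const char* chars;
} ToyString;

typedef struct {
  bool isMarked;
  ToyString** constants;  /* stand-in for function->chunk.constants */
  size_t constantCount;
} ToyFunction;

/* Mark a function and every string in its constant table. */
static void toyMarkFunction(ToyFunction* function) {
  if (function == NULL || function->isMarked) return;
  function->isMarked = true;
  for (size_t i = 0; i < function->constantCount; i++) {
    assert(function->constants[i] != NULL);
    function->constants[i]->isMarked = true;
  }
}

/* Returns true if "first value" is marked after a mark phase
   whose only root is the enclosing function. */
static bool toyConstantSurvives(void) {
  ToyString first = {false, "first value"};
  ToyString updated = {false, "updated"};
  ToyString* constants[] = {&first, &updated};
  ToyFunction script = {false, constants, 2};

  toyMarkFunction(&script);
  return first.isMarked;
}
```

Under these assumptions, the constant is marked whenever the function is, regardless of what the variable a currently holds.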
In the language developed in the book you refer to, strings do not share memory with each other. When the statement var a = "first value"; is executed, a is set to a copy of the constant string "first value" stored in the function's constant table. It's that copy which becomes garbage after the following assignment. The original constant is, as you say, referred to by the function itself, and cannot be garbage collected while the function is accessible.
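As a rough illustration of the copying model described here, the sketch below walks through the two statements by hand. It is a toy model under that assumption, not the book's implementation: `toyCopyString` and `toyRunExample` are hypothetical names, and the "collector" is just an explicit free of the unreachable copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocate a private copy of a constant string. */
static char* toyCopyString(const char* constant) {
  size_t length = strlen(constant);
  char* copy = malloc(length + 1);
  assert(copy != NULL);
  memcpy(copy, constant, length + 1);  /* includes the terminator */
  return copy;
}

/* Simulates `var a = "first value"; a = "updated";` under the copying
   model. Returns 1 if the constant table is untouched after the first
   copy has been freed. */
static int toyRunExample(void) {
  /* Stand-in for the function's constant table. */
  static const char* constants[] = {"first value", "updated"};

  char* a = toyCopyString(constants[0]);  /* var a = "first value"; */
  char* unreachable = a;                  /* remember the old copy */
  a = toyCopyString(constants[1]);        /* a = "updated"; */

  /* Nothing in the program refers to the old copy any more, so a
     collector may free it. The constant itself is a separate object. */
  free(unreachable);

  int ok = strcmp(constants[0], "first value") == 0 &&
           strcmp(a, "updated") == 0;
  free(a);
  return ok;
}
```

In this model the freed object is the copy held by a, while the entry in the constant table stays alive for as long as the function does.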
This is a common strategy in languages in which strings are mutable. One alternative is to copy strings lazily, usually called "copy on write" (COW). That saves copies at the cost of more complicated code, and additional synchronization to support multiple threads, if necessary.
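For comparison, a minimal single-threaded sketch of copy on write might look like the following. The names (`CowBuffer`, `CowString`, `cowShare`, `cowSetChar`, and so on) are hypothetical, and the reference count is not atomic, so this version would need the extra synchronization mentioned above before it could be used from multiple threads.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared, reference-counted character buffer (illustrative only). */
typedef struct {
  int refCount;
  size_t length;
  char* chars;
} CowBuffer;

typedef struct {
  CowBuffer* buffer;
} CowString;

static CowBuffer* cowNewBuffer(const char* chars, size_t length) {
  CowBuffer* buffer = malloc(sizeof(CowBuffer));
  assert(buffer != NULL);
  buffer->chars = malloc(length + 1);
  assert(buffer->chars != NULL);
  memcpy(buffer->chars, chars, length);
  buffer->chars[length] = '\0';
  buffer->refCount = 1;
  buffer->length = length;
  return buffer;
}

static CowString cowFromLiteral(const char* chars) {
  CowString s = {cowNewBuffer(chars, strlen(chars))};
  return s;
}

/* "Copying" a string only bumps the reference count. */
static CowString cowShare(CowString s) {
  s.buffer->refCount++;
  return s;
}

static void cowRelease(CowString s) {
  if (--s.buffer->refCount == 0) {
    free(s.buffer->chars);
    free(s.buffer);
  }
}

/* Write one character. The real copy happens here, and only if the
   buffer is still shared with another string. */
static void cowSetChar(CowString* s, size_t index, char c) {
  assert(index < s->buffer->length);
  if (s->buffer->refCount > 1) {
    CowBuffer* copy = cowNewBuffer(s->buffer->chars, s->buffer->length);
    s->buffer->refCount--;
    s->buffer = copy;
  }
  s->buffer->chars[index] = c;
}

/* Returns 1 if sharing and copy on write behave as described. */
static int cowDemo(void) {
  CowString constant = cowFromLiteral("first value");
  CowString a = cowShare(constant);  /* no bytes copied yet */
  int sharedBefore = (a.buffer == constant.buffer);

  cowSetChar(&a, 0, 'F');            /* first write triggers the copy */

  int ok = sharedBefore && a.buffer != constant.buffer &&
           strcmp(constant.buffer->chars, "first value") == 0 &&
           strcmp(a.buffer->chars, "First value") == 0;
  cowRelease(a);
  cowRelease(constant);
  return ok;
}
```

The saving is that a string which is only read never gets its own buffer; the cost is the extra branch on every write and the bookkeeping around the count.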
Many languages feature immutable strings, which makes it much easier to share memory, but makes optimisation of string updates more complicated. So there are a lot of trade-offs.
Also see the "challenges" at the end of the chapter on strings.