How to avoid global variable declaration when using Perl's dynamic scoping?


I am trying to write a perl script that calls a function written somewhere else (by someone else) which manipulates some of the variables in my script's scope. Let's say the script is main.pl and the function is there in funcs.pm. My main.pl looks like this:

use warnings;
use strict;

package plshelp;
use funcs;

my $var = 3;
print "$var\n";   # <--- prints 3

{                 # New scope somehow prevents visibility of $pointer outside
    local our $pointer = \$var;
    change();
}

print "$var\n";   # <--- Ideally should print whatever funcs.pm wanted

For some reason, using local our $pointer; prevents visibility of $pointer outside the scope. But if I just use our $pointer;, the variable can be seen outside the scope in main.pl using $plshelp::pointer (but not in funcs.pm, so it would be useless anyway). As a side-note, could someone please explain this?

funcs.pm looks something like this:

use warnings;
use strict;

package plshelp;

sub change
{
    ${$pointer} = 4;
}

I expected this to change the value of $var and print 4 when the main script was run. But I get a compile error saying $pointer wasn't declared. This error can be removed by adding our $pointer; at the top of change in funcs.pm, but that would create an unnecessary global variable that is visible everywhere. We can also remove this error by removing the use strict;, but that seems like a bad idea. We can also get it to work by using $plshelp::pointer in funcs.pm, but the person writing funcs.pm doesn't want to do that.

Is there a good way to achieve this functionality of letting funcs.pm manipulate variables in my scope without declaring global variables? If we were going for global variables anyway, I guess I don't need to use dynamic scoping at all.

Let's just say it's not possible to pass arguments to the function for some reason.

Update

It seems that local our isn't doing anything "special" as far as preventing visibility is concerned. From perldoc:

This means that when use strict 'vars' is in effect, our lets you use a package variable without qualifying it with the package name, but only within the lexical scope of the our declaration. This applies immediately--even within the same statement.

and

This works even if the package variable has not been used before, as package variables spring into existence when first used.

So this means that $pointer "exists" even after we leave the curly braces; we just have to refer to it as $plshelp::pointer instead of $pointer. But since we applied local before assigning to $pointer, its old (undefined) value is restored when the block ends, so it is undefined outside the scope (although it is still "declared", whatever that means). A clearer way to write this would be (local (our $pointer)) = \$var;. Here, our $pointer "declares" $pointer and returns it. We then apply local to that returned variable, which again returns $pointer, and finally we assign \$var to it.

But this still leaves the main question of whether there is a good way of achieving the required functionality unanswered.


Solution

  • Let's be clear about how global variables declared with our work and why they have to be declared: There's a difference between the storage of a global variable and the visibility of its unqualified name. Under use strict, an undeclared variable name will not implicitly refer to a global variable.

    • We can always access the global variable with its fully qualified name, e.g. $Foo::bar.

    • If a global variable in the current package already exists at compile time and is marked as an imported variable, we can access it with an unqualified name, e.g. $bar. If a Foo package is written appropriately, we could say use Foo qw($bar); say $bar where $bar is now a global variable in our package.

    • With our $foo, we create a global variable in the current package if that variable doesn't already exist. The name of the variable is also made available in the current lexical scope, just like the variable of a my declaration.
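To make the split between storage and name visibility concrete, here is a minimal sketch. The package Demo and the variable $counter are made up for illustration and do not come from the question.

```perl
use strict;
use warnings;

package Demo;               # hypothetical package name

{
    our $counter = 10;      # creates $Demo::counter and makes the short name visible in this block
    $counter++;             # the unqualified name works here
}

# Outside the block the short name is out of scope under strict,
# but the package variable still exists and keeps its value.
print "$Demo::counter\n";   # prints 11

{
    our $counter;           # only re-introduces the name; same storage as before
    print "$counter\n";     # prints 11
}
```

The second our does not reset anything: it is purely a compile-time declaration of the short name.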

    The local operator does not create a variable. Instead, it saves the current value of a global variable and clears that variable. At the end of the current scope, the old value is restored. You can interpret each global variable name as a stack of values. With local you can add (and remove) values on the stack. So while local can dynamically scope a value, it does not create a dynamically scoped variable name.
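The "stack of values" picture can be sketched as follows. The names $level, show and nested are illustrative only.

```perl
use strict;
use warnings;

our $level = 'outer';        # one global variable, currently holding 'outer'

sub show { return $level }   # reads whatever value is current at call time

sub nested {
    local $level = 'inner';  # save the old value, install a new one for this dynamic scope
    return show();           # show() sees 'inner' although it was compiled elsewhere
}                            # old value is restored when nested() returns

my @seen = (show(), nested(), show());
print "@seen\n";             # prints: outer inner outer
```

Note that show() never declares anything new: local changed the value of the one existing variable, not which variable the name refers to.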

    By carefully considering which code is compiled when, it becomes clear why your example doesn't currently work:

    • In your main script, you load the module funcs. The use statement is executed in the BEGIN phase, i.e. during parsing.

      use warnings;
      use strict;
      
      package plshelp;
      use funcs;
      
    • The funcs module is compiled:

      use warnings;
      use strict;
      
      package plshelp;
      
      sub change
      {
          ${$pointer} = 4;
      }
      

      At this point, no $pointer variable is in lexical scope and no imported global $pointer variable exists. Therefore you get an error. This compile-time observation is unrelated to the existence of a $pointer variable at runtime.
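The compile-time nature of this check can be reproduced in a single file with a string eval, which compiles its code at runtime. This is only a sketch: the package Demo and the variable $target are hypothetical, and the string eval stands in for loading a separate module.

```perl
use strict;
use warnings;

my $target = 3;
$Demo::pointer = \$target;               # the package variable exists and holds a reference
my $via_qualified = ${ $Demo::pointer }; # the fully qualified name always works

# Compile a sub that uses the unqualified, undeclared name.
# The eval'd code inherits "use strict" from the enclosing scope.
my $ok = eval q{
    package Demo;
    sub change { ${$pointer} = 4 }
    1;
};

# Compilation fails even though $Demo::pointer is populated at this point.
print $ok ? "compiled\n" : "compile error: $@";
```

The error reported in $@ is the same "Global symbol ... requires explicit package name" complaint as in the question, which shows that strict only looks at declarations visible at compile time.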

    The canonical way to fix this error is to declare an our $pointer variable name in the scope of the sub change:

    sub change {
        our $pointer;
        ${$pointer} = 4;
    }
    

    Note that the global variable will exist anyway, this just brings the name into scope for use as an unqualified variable name.
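Putting both halves together, a single-file sketch of the question's setup might look like this. It assumes change() lives in the same file rather than in funcs.pm, purely to keep the example self-contained.

```perl
use strict;
use warnings;

package plshelp;

sub change {
    our $pointer;                 # brings $plshelp::pointer into scope under its short name
    ${$pointer} = 4;
}

my $var = 3;

{
    local our $pointer = \$var;   # the value is visible to change() only while this block runs
    change();
}

print "$var\n";                   # prints 4
print defined $plshelp::pointer ? "still set\n" : "restored to undef\n";   # prints: restored to undef
```

Both our declarations refer to the same package variable; local only ensures that the reference to $var does not outlive the block.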


    Just because you can use global variables doesn't mean that you should. There are two issues with them:

    • On a design level, global variables do not declare a clear interface. By using a fully qualified name you can simply access a variable without any checks. They do not provide any encapsulation. This makes for fragile software and weird action-at-a-distance.

    • On an implementation level, global variables are simply less efficient than lexical variables. I have never actually seen this matter, but think of the cycles!

    Also, global variables are global variables: They can only have one value at a time! Scoping the value with local can help to avoid this in some cases, but there can still be conflicts in complex systems where two modules want to set the same global variable to different values and those modules call into each other.

    The only good uses for global variables I have seen are to provide additional context to a callback that cannot take extra parameters, roughly similar to your approach. But where possible it is always better to pass the context as a parameter. Subroutine arguments are already effectively dynamically scoped:

    sub change {
      my ($pointer) = @_;
      ${$pointer} = 4;
    }
    
    ...
    my $var = 3;
    change(\$var);
    

    If there is a lot of context, it can become cumbersome to pass all those references: change(\$foo, \$bar, \$baz, \@something_else, \%even_more, ...). It could then make sense to bundle that context into an object, which can then be manipulated in a more controlled manner. Manipulating local or global variables is not always the best design.
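One possible shape for such a context object is sketched below. The class name Context, its value_ref field and the set_value method are assumptions made for this example, not part of the original code.

```perl
use strict;
use warnings;

package Context;                  # hypothetical class bundling the caller's state

sub new {
    my ($class, %args) = @_;
    my $self = { %args };         # e.g. value_ref => \$var
    return bless $self, $class;
}

sub set_value {
    my ($self, $new) = @_;
    ${ $self->{value_ref} } = $new;   # the only place that writes through the reference
    return $self;
}

package main;

sub change {
    my ($ctx) = @_;               # context arrives as an ordinary argument
    $ctx->set_value(4);
}

my $var = 3;
my $ctx = Context->new(value_ref => \$var);
change($ctx);
print "$var\n";                   # prints 4
```

Compared with a global, the callee can only touch what the object exposes, and two callers can hold independent contexts at the same time.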