If I do a match with a regular expression with ten captures:
/(o)(t)(th)(f)(fi)(s)(se)(e)(n)(t)/.match("otthffisseent")
then, for $10, I get:
$10 # => "t"
but it is missing from global_variables. I get (in an irb session):
[:$;, :$-F, :$@, :$!, :$SAFE, :$~, :$&, :$`, :$', :$+, :$=, :$KCODE, :$-K, :$,,
:$/, :$-0, :$\, :$_, :$stdin, :$stdout, :$stderr, :$>, :$<, :$., :$FILENAME,
:$-i, :$*, :$?, :$$, :$:, :$-I, :$LOAD_PATH, :$", :$LOADED_FEATURES,
:$VERBOSE, :$-v, :$-w, :$-W, :$DEBUG, :$-d, :$0, :$PROGRAM_NAME, :$-p, :$-l,
:$-a, :$binding, :$1, :$2, :$3, :$4, :$5, :$6, :$7, :$8, :$9]
Here, only the first nine are listed:
:$1, :$2, :$3, :$4, :$5, :$6, :$7, :$8, :$9
This is also confirmed by:
global_variables.include?(:$10) # => false
Where is $10 stored, and why isn’t it stored in global_variables?
The numbered variables returned from Kernel#global_variables will always be the same, even before they are assigned. That is, $1 through $9 will be returned even before you do the match, and matching more groups won't add to the list. (They also cannot be assigned to directly, e.g. using $10 = "foo".)
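As a small sketch of the two claims above: the numbered variables are readable before any match but cannot be assigned. The helper names assignable? and untouched_backrefs are made up for this illustration. The sketch deliberately does not assert anything about the contents of global_variables, since that output may differ between Ruby versions.

```ruby
# Hypothetical helper: returns true if the snippet parses and runs,
# false if Ruby rejects it with a SyntaxError.
def assignable?(source)
  eval(source)
  true
rescue SyntaxError
  false
end

# Hypothetical helper: reads the back-references in a fresh method frame,
# where no match has been performed yet.
def untouched_backrefs
  [$~, $1, $10]
end

p assignable?('$10 = "foo"')  # => false (numbered variables are read-only)
p assignable?('$~ = nil')     # => true  ($~ itself can be assigned)
p untouched_backrefs          # => [nil, nil, nil]
```

Reading $10 before a match simply yields nil, while assigning to it is rejected before the code even runs.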
Consider the source code for the method:
VALUE
rb_f_global_variables(void)
{
    VALUE ary = rb_ary_new();
    char buf[2];
    int i;

    st_foreach_safe(rb_global_tbl, gvar_i, ary);
    buf[0] = '$';
    for (i = 1; i <= 9; ++i) {
        buf[1] = (char)(i + '0');
        rb_ary_push(ary, ID2SYM(rb_intern2(buf, 2)));
    }
    return ary;
}
You can (after getting used to looking at C) see from the for loop that the symbols $1 through $9 are hard-coded into the return value of the method.
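For readers less comfortable with C, here is a rough Ruby paraphrase of what that loop contributes to the result. The method name hardcoded_backref_symbols is invented for this sketch and is not part of Ruby.

```ruby
# Rough Ruby equivalent of the C for loop: build the symbols :$1 through :$9
# regardless of whether any match has taken place.
def hardcoded_backref_symbols
  (1..9).map { |i| "$#{i}".to_sym }
end

p hardcoded_backref_symbols
# => [:$1, :$2, :$3, :$4, :$5, :$6, :$7, :$8, :$9]
```

The list stops at nine because the C buffer holds exactly two characters: the dollar sign and a single digit.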
So how, then, can you still use $10 if the output of global_variables doesn't change? Well, the output might be a bit misleading, because it suggests your match data is stored in separate variables, but these are just shortcuts that delegate to the MatchData object stored in $~.
Essentially, $n looks at $~[n]. You'll find that $~, which holds this MatchData object and comes from the global table, is part of the original output from the method, but it is not assigned until you do a match.
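One way to convince yourself of the delegation is to change $~ and watch the numbered variables follow. This is only a sketch; the method names tenth_group and backref_demo are made up for the example.

```ruby
# $10 and $~[10] read the same capture from the last match.
def tenth_group
  /(o)(t)(th)(f)(fi)(s)(se)(e)(n)(t)/.match("otthffisseent")
  [$10, $~[10]]
end

# Reassigning $~ changes what $1 and $2 return, without running a new match.
def backref_demo
  first = /(a)(b)/.match("ab")
  /(x)(y)/.match("xy")
  after_second = [$1, $2]  # taken from the most recent match
  $~ = first               # point $~ back at the earlier MatchData
  after_swap = [$1, $2]
  $~ = nil                 # no current match at all
  after_clear = [$1, $2]
  [after_second, after_swap, after_clear]
end

p tenth_group   # => ["t", "t"]
p backref_demo  # => [["x", "y"], ["a", "b"], [nil, nil]]
```

Nothing is ever written to $1 or $2 here; only $~ changes, and the numbered variables reflect whatever it currently holds.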
As for the justification for including $1 through $9 in the output of the function, you would need to ask someone on the Ruby core team. It might seem arbitrary, but there is likely some deliberation that went into the decision.