I have an array of strings, would like to record its original order in another array by referencing its elements, then I sort the array of strings, it creates new element in this array rather than the original element.
my @item = qw/zz xx/;
my @order;
foreach (@item) {
push @order, \$_;
}
print $item[0] . " " . \$item[0] . "\n";
print $item[1] . " " . \$item[1] . "\n";
@item = sort {$a cmp $b} @item;
print $item[0] . " " . \$item[0] . "\n";
print $item[1] . " " . \$item[1] . "\n";
Result:
zz SCALAR(0x5618ffd27220)
xx SCALAR(0x5618ffd273e8)
xx SCALAR(0x5618ffd46668)
zz SCALAR(0x5618ffd46698)
If I comment out the reference block, then the outcome return to expected result. So can anyone explain how referencing the element would affect the result of "sort".
Expected Result:
zz SCALAR(0x5618ffd27220)
xx SCALAR(0x5618ffd273e8)
xx SCALAR(0x5618ffd273e8)
zz SCALAR(0x5618ffd27220)
Assigning a scalar to an element of an array makes a copy of the scalar.
For example,
use v5.14;
my $x = 123;
say \$x;
my @a = $x;
say \$a[0];
outputs something like
SCALAR(0x55dd65bf0e18)
SCALAR(0x55dd65bc14b8)
So the result you get is the result you should expect.
But what about
my @c = qw( l k j );
say for \( @c );
@c = sort @c;
say for \( @c );
SCALAR(0x55c396e0d4b8)
SCALAR(0x55c396e0d4e8)
SCALAR(0x55c396e0d680)
SCALAR(0x55c396e0d680)
SCALAR(0x55c396e0d4e8)
SCALAR(0x55c396e0d4b8)
sort
has an optimization for the case when an array is being sorted, and the results are assigned to the same array (e.g. @a = sort @a;
). In these situations, it performs an in-place sort. It simply moves the C pointers around without creating any new scalars.
As long as nothing references the values of the array, it is safe to do so.
But what about
my @c = qw( l k j );
my @r = \( @c );
say for \( @c );
@c = sort @c;
say for \( @c );
SCALAR(0x55abf592e4b8)
SCALAR(0x55abf592e4e8)
SCALAR(0x55abf592e680)
SCALAR(0x55abf595d660)
SCALAR(0x55abf595d648)
SCALAR(0x55abf595d6f0)
An optimization is only correct if it's transparent. An optimization that causes code to behave differently than unoptimized code is buggy. In other words, sort
should work the same with or without the optimization. Which means that sort
needs to make copies of values referenced by something other the array even when the optimization is used.
It can get away from doing this if nothing else references the array values, but that's not the case here.
Note that versions of Perl before 5.26 were buggy. These programs should output the same, but they don't:
$ perl5.24t -le'@c = qw(l k j); $d = \$c[2]; @c = @e = sort @c; $$d = "X"; print @c'
jkl
$ perl5.24t -le'@c = qw(l k j); $d = \$c[2]; @c = sort @c; $$d = "X"; print @c'
Xkl
This was fixed in 5.26. Since then, the sort is still done in-place, but a copy of every value with a refcount greater than one is made first.
$ perl5.26t -le'@c = qw(l k j); $d = \$c[2]; @c = @e = sort @c; $$d = "X"; print @c'
jkl
$ perl5.26t -le'@c = qw(l k j); $d = \$c[2]; @c = sort @c; $$d = "X"; print @c'
jkl