Search code examples
sortingperltransformbenchmarking

When are Schwartzian Transforms useful?


While going through the "Intermediate Perl" book I noticed a section on Schwartzian Transforms and tried the example in the exercise (9.9.2) but noticed that multiple runs resulted in the transform taking more time then the normal sort. The code here performs a simple sort of the files in the windows\system32 directory based on file size -

#!/usr/bin/perl
use strict;
use warnings;

use Benchmark;

my $time = timethese( 10, {
            testA => sub { map $_->[0],     
                        sort {$a->[1] <=> $b->[1]}
                        map [$_, -s $_],
                        glob "C:\\Windows\\System32\\*";
                    },
            testB => sub { sort { -s $a <=> -s $b } glob "C:\\Windows\\System32\\*";
                    },
            }
        );

The output is -

Benchmark: timing 10 iterations of testA, testB...
     testA: 11 wallclock secs ( 1.89 usr +  8.38 sys = 10.27 CPU) @  0.97/s (n=10)
     testB:  5 wallclock secs ( 0.89 usr +  4.11 sys =  5.00 CPU) @  2.00/s (n=10)

My understanding was that since the file operation (-s) needs to be repeated over and over in the testB case it should run a lot slower than testA. The output though deviates from that observation. What am I missing here?


Solution

  • For me, the output looks a bit different:

     testA:  1 wallclock secs ( 0.16 usr +  0.11 sys =  0.27 CPU) @ 37.04/s (n=10)
            (warning: too few iterations for a reliable count)
     testB:  0 wallclock secs ( 0.09 usr +  0.02 sys =  0.11 CPU) @ 90.91/s (n=10)
            (warning: too few iterations for a reliable count)
    

    Benchmarking this with a decent value of iterations (I chose 100,000), I get this:

     testA: 23 wallclock secs (12.15 usr + 10.05 sys = 22.20 CPU) @ 4504.50/s (n=100000)
     testB: 11 wallclock secs ( 6.02 usr +  5.57 sys = 11.59 CPU) @ 8628.13/s (n=100000)
    

    A look at the code tells me that those two subs probably spend most of their time globbing the files, so I did this:

    my @files = glob "C:\\Windows\\System32\\*";
    my $time = timethese( 1_000_000, {
                    testA => sub {
                                    map $_->[0],
                                        sort {$a->[1] <=> $b->[1]}
                                            map [$_, -s $_],
                                                 @files;
                             },
                    testB => sub {
                                sort { -s $a <=> -s $b } @files;
                             },
                    }
            );
    

    And get:

     testA: 103 wallclock secs (56.93 usr + 45.61 sys = 102.54 CPU) @ 9752.29/s (n=1000000)
     testB: -1 wallclock secs ( 0.12 usr +  0.00 sys =  0.12 CPU) @ 8333333.33/s (n=1000000)
            (warning: too few iterations for a reliable count)
    

    Something smells fishy here, doesn't it?

    So, let's take a look at the docs:

    perldoc -f sort

    In scalar context, the behaviour of "sort()" is undefined.

    Aha! So let's try it again:

    my @files = glob "C:\\Windows\\System32\\*";
    my $time = timethese( 100_000, {
                    testA => sub {
                                  my @arr=  map $_->[0],
                                        sort {$a->[1] <=> $b->[1]}
                                            map [$_, -s $_],
                                                 @files;
                             },
                    testB => sub {
                                my @arr = sort { -s $a <=> -s $b } @files;
                             },
                    }
            );
    

    This gives me:

     testA: 12 wallclock secs ( 7.44 usr +  4.55 sys = 11.99 CPU) @ 8340.28/s (n=100000)
     testB: 34 wallclock secs ( 6.44 usr + 28.30 sys = 34.74 CPU) @ 2878.53/s (n=100000)
    

    So. To answer your questions: A Schwartzian transform will help you whenever you use it in a meaningful way. Benchmarking will show this when you benchmark in a meaningful way.