Perl - en / em dash in command line arguments


I'm having a problem parsing command line arguments in my Perl script. Specifically, I'd like Perl to parse arguments prefixed with an en dash or em dash as well as those prefixed with a hyphen. Consider the following command:

my_script.pl -firstParam someValue –secondParam someValue2

As you can see, firstParam is prefixed with a hyphen and Perl parses it without any problem, but secondParam is prefixed with an en dash, and Perl does not recognize it as an option. I am using GetOptions() to parse the arguments:

GetOptions(
    "firstParam" => \$firstParam,
    "secondParam" => \$secondParam
);

Solution

  • If you're using Getopt::Long, you can preprocess the arguments before giving them to GetOptions:

    #! /usr/bin/perl
    use warnings;
    use strict;
    
    use Getopt::Long;
    
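    # Replace a leading en dash (U+2013) or em dash (U+2014), given as raw UTF-8 bytes, with a hyphen.
    s/^\xe2\x80[\x93\x94]/-/ for @ARGV;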
    
    GetOptions('firstParam:s'  => \ my $first_param,
               'secondParam:s' => \ my $second_param);
    print "$first_param, $second_param\n";
    

    It might be cleaner to decode the arguments first, so the pattern can match the dash as a character rather than as raw bytes:

    use Encode;
    
    $_ = decode('UTF-8', $_), s/^[\N{U+2013}\N{U+2014}]/-/ for @ARGV;
    

    To make this work under different locale settings, use Encode::Locale:

    #! /usr/bin/perl
    use warnings;
    use strict;
    
    use Encode::Locale;
    use Encode;
    use Getopt::Long;
    
    $_ = decode(locale => $_), s/^[\N{U+2013}\N{U+2014}]/-/ for @ARGV;
    
    GetOptions('firstParam:s'  => \ my $first_param,
               'secondParam:s' => \ my $second_param);
    print "$first_param, $second_param\n";
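
The preprocessing step can also be wrapped in a small helper and tried out without touching the real command line. The following is only a sketch: the sub name normalize_dashes and the thirdParam option are made up for illustration, and the arguments are assumed to arrive as UTF-8 bytes. It uses GetOptionsFromArray from Getopt::Long so that a simulated argument list can be parsed instead of @ARGV.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Encode qw(decode);
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not part of any module): decode each raw argument
# as UTF-8 and replace a leading en dash (U+2013) or em dash (U+2014)
# with a plain hyphen. Assumes the shell passes UTF-8 encoded bytes.
sub normalize_dashes {
    my @args = @_;
    for my $arg (@args) {
        $arg = decode('UTF-8', $arg);
        $arg =~ s/^[\N{U+2013}\N{U+2014}]/-/;
    }
    return @args;
}

# Simulated command line, written as the raw bytes a UTF-8 terminal would send.
my @argv = normalize_dashes(
    '-firstParam',             'someValue',
    "\xe2\x80\x93secondParam", 'someValue2',    # en dash prefix
    "\xe2\x80\x94thirdParam",  'someValue3',    # em dash prefix (made-up option)
);

GetOptionsFromArray(\@argv,
    'firstParam:s'  => \ my $first_param,
    'secondParam:s' => \ my $second_param,
    'thirdParam:s'  => \ my $third_param,
) or die "Could not parse options\n";

print "$first_param, $second_param, $third_param\n";
```

In a real script the same helper would be applied as @ARGV = normalize_dashes(@ARGV); before calling GetOptions. Note that only a single leading dash character is replaced, so a value that merely contains a dash further in is left alone.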