Perl sprintf format specifier as input variable


I'm trying to use a string provided as a command-line argument as the format specifier in a sprintf statement in a Perl script, e.g., myscr.pl -fmt="%.2f\t%.1f\n" on the command line. Somewhere in the script I have (undef,$fmt)=split(/=/) to extract the format string, and then I use it in a loop as follows: $d=sprintf($fmt,$l[$i],$l[$j]). This doesn't quite work; the result looks like this:

2.32\t375.8\n2.37\t386.3\n

i.e., the tab and newline formatters are ignored and treated as normal letters, but the number of digits is processed correctly. On the other hand, I also have a default $fmt="%.3f\t%.3f\n" in case no option -fmt is provided, and that produces the correct format in the same sprintf statement:

2.323   375.817
2.372   386.275

Why? What do I have to do to pass a format specifier as a command-line argument the way I intend?


Solution

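  • A POSIX shell passes \t and \n inside double quotes through untouched, so your program gets the individual characters, \ and t (NOT a tab) and, further down the string, \ and n (not a linefeed). Perl turns such sequences into control characters only when they appear in a double-quoted string literal in the source code, which is why your default $fmt works; a string that arrives at run time is never processed that way.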

    You can check that by printing each character of the string:

    use warnings;
    use strict;
    use feature 'say';
    
    # See below for a better way to handle arguments
    my (undef, $fmt) = split /=/, $ARGV[0], 2;  # limit 2 keeps any '=' inside the format
    
    say "|$_|" for split '', $fmt;
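
    As a minimal sketch of why the default works and the argument does not: the single-quoted string below merely stands in for what arrives via @ARGV, and the variable names are made up for illustration.

    ```perl
    use strict;
    use warnings;

    my $from_source = "%.3f\t%.3f\n";   # double-quoted literal: real tab, real newline
    my $from_argv   = '%.3f\t%.3f\n';   # stands in for run-time input: backslash + letter

    for my $pair ( [ 'source literal', $from_source ], [ 'run-time data', $from_argv ] ) {
        my ($label, $str) = @$pair;
        # Report the length and whether an actual tab character is present.
        printf "%-14s length=%d contains_tab=%s\n",
            $label, length($str), ( $str =~ /\t/ ? 'yes' : 'no' );
    }
    ```

    The first string is 10 characters long and contains a tab; the second is 12 characters long and does not.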
    

    There are ways to reconstitute the escape (control) sequences. The best one is probably to use a good library (and review its code to see how it does it!). For example, String::Escape does it with a careful regex:

    use warnings;
    use strict;
    
    use String::Escape qw(unbackslash);
    
    # See below for a better way to handle arguments
    my (undef, $fmt) = split /=/, $ARGV[0], 2;  # limit 2 keeps any '=' inside the format
    
    my ($n1, $n2) = (2.323, 375.817);
    
    my $format_string = unbackslash $fmt;
    
    printf $format_string, $n1, $n2;
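
    If installing a module is not an option, the regex idea can be sketched by hand. This is only an illustration, not how String::Escape is implemented: unescape_basic and %ESCAPE are hypothetical names, and the table covers just a few sequences.

    ```perl
    use strict;
    use warnings;

    # Hypothetical lookup table: the character after the backslash => replacement.
    my %ESCAPE = (
        'n'  => "\n",
        't'  => "\t",
        'r'  => "\r",
        '\\' => "\\",
    );

    # Hypothetical helper: translates the sequences listed above and leaves
    # any other backslash sequence untouched.
    sub unescape_basic {
        my ($str) = @_;
        # A single left-to-right pass, so an escaped backslash followed by n
        # stays backslash + n instead of turning into a newline.
        $str =~ s/\\(.)/exists $ESCAPE{$1} ? $ESCAPE{$1} : "\\$1"/ge;
        return $str;
    }

    my $fmt = unescape_basic('%.2f\t%.1f\n');   # what -fmt="%.2f\t%.1f\n" delivers
    printf $fmt, 2.323, 375.817;
    ```

    Anything beyond this small table (octal, hex, Unicode names) is better left to a library.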
    

    There are yet other libraries for this kind of work.

    And then there's the matter of handling input, which deserves a comment. If you're really going to do it manually then I'd say just pass the format string itself as the only argument; splitting on = is hacky anyway. But if you want it nicer, use a library, for example the great Getopt::Long:

    use warnings;
    use strict;
    
    use Getopt::Long;
    
    my $fmt;  # can provide some default value if/when suitable
    
    GetOptions( 'fmt|format=s' => \$fmt );
    

    And if that's all you need, you can also declare the variable in place:

    GetOptions( 'fmt|format=s' => \my $fmt );
    

    That's it. Now script --fmt "..." gets the string into $fmt, and so does --format "...", or any unambiguous abbreviation (like -f if there are no other options starting with f). Instead of -- one can use -. There's a lot more that one can do with this library.
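
    Putting option parsing and unescaping together might look like the sketch below. It assumes that only \t and \n need translating and uses an inline substitution in place of a library; format_from_args is a hypothetical helper name.

    ```perl
    use strict;
    use warnings;
    use Getopt::Long qw(GetOptionsFromArray);

    # Hypothetical helper: parses an argument list (normally @ARGV) and
    # returns a format string with \t and \n turned into real characters.
    sub format_from_args {
        my @args = @_;
        my $fmt  = '%.3f\t%.3f\n';    # default, written the way a user would type it
        GetOptionsFromArray( \@args, 'fmt|format=s' => \$fmt )
            or die "Usage: $0 [--fmt FORMAT]\n";
        # Simplification: handles only \t and \n, and ignores escaped backslashes.
        my %escape = ( t => "\t", n => "\n" );
        $fmt =~ s/\\([tn])/$escape{$1}/g;
        return $fmt;
    }

    my $format_string = format_from_args(@ARGV);
    printf $format_string, 2.323, 375.817;
    ```

    Running the default and the user-supplied format through the same translation keeps the two code paths identical.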


    Finally, while I'd rather not even mention this, the job can be done using string eval. It comes with serious security considerations: what input could your program receive? It is defensible only if you know everything that can possibly come in, and perhaps add some code to check that the input is indeed a benign format string.

    my $format_string = eval "qq{$fmt}";  # DANGEROUS with external input
    

    But if { or } can be in $fmt then we'd have to pick different delimiters for qq. (So perhaps one would need to add code that checks for that, too.)
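
    One possible shape for such a check, as a sketch only: the whitelist below is an assumption about what a format string needs, and interpolate_checked is a hypothetical name. It rejects $ and @ (which trigger interpolation) and { } (the qq delimiters) before anything reaches eval.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: dies unless the string passes a narrow whitelist,
    # then lets Perl interpret the backslash sequences via string eval.
    sub interpolate_checked {
        my ($fmt) = @_;
        # Allowed: letters, digits, space, and the characters % . + - # \ |
        die "Suspicious format string\n"
            unless $fmt =~ /\A[A-Za-z0-9 %.+\-#\\|]*\z/;
        my $result = eval "qq{$fmt}";
        # A trailing backslash, for instance, makes the eval fail to compile.
        die "Cannot interpret format string: $@" unless defined $result;
        return $result;
    }

    my $format_string = interpolate_checked('%.2f\t%.1f\n');
    printf $format_string, 2.323, 375.817;
    ```

    With a whitelist this narrow the eval buys little over a plain substitution, which is one more argument for avoiding it.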

    Again, I'd much rather use a library that does this in a safe way.