My setup: perl-5.20.2, UTF-8 environment.
Consider the following two bash examples. The first one works OK, the second doesn't.
echo -n 'привет мир' | perl -MEncode -le '$a=decode("utf8",<>); $x=decode("utf8","мир"); print encode("utf8",sprintf("% 11s",$a)) if $a=~/$x/'|grep -q ' привет мир' && echo OK
for (( i=0; $i < 512; i=$((i+1)) )); do echo -n 'привет мир' | perl -C$i -le '$a=<>; print sprintf("% 11s",$a) if $a=~/мир/' | grep -q ' привет мир' && echo $i; done
Why is there no -C flag number in case 2 that makes the example work at least once?
Why is there no -C flag number ... that makes the example work at least once?
Because using UTF-8 literals in your Perl source requires the utf8 pragma (use utf8;):
for (( i=0; $i < 512; i=$((i+1)) )); do echo -n 'привет мир' | perl -C$i -le 'use utf8; $a=<>; print sprintf("% 11s",$a) if $a=~/мир/' | grep -q ' привет мир' && echo $i; done
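There's no -C value that replicates use utf8; because -C only controls the encoding of I/O handles and @ARGV, not how the source code itself is parsed. With use utf8; any odd value for -C passes the test (STDIN is assumed to be UTF-8), but you get a "Wide character in print" warning unless STDOUT is also set to UTF-8.
So -C3 works cleanly, as does any value of $i where $i % 4 == 3 (both the STDIN bit, 1, and the STDOUT bit, 2, are set). For one-liners, you probably want -CSDA (-C63) to say that all I/O and @ARGV should be UTF-8.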
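The arithmetic behind those numbers can be sketched in plain shell. The bit values are the ones perlrun documents for -C; the variable names are made up for this illustration.

```shell
# Bit values documented in perlrun for -C (variable names are illustrative).
STDIN_BIT=1; STDOUT_BIT=2; STDERR_BIT=4   # I, O, E  (S = I + O + E)
IN_LAYER=8; OUT_LAYER=16                  # i, o     (D = i + o)
ARGV_BIT=32                               # A

sda=$(( STDIN_BIT | STDOUT_BIT | STDERR_BIT | IN_LAYER | OUT_LAYER | ARGV_BIT ))
echo "-CSDA = -C$sda"

# Values below 16 that set both the STDIN and the STDOUT bit.
both=""
i=0
while [ "$i" -lt 16 ]; do
  if [ $(( i & 3 )) -eq 3 ]; then both="$both $i"; fi
  i=$(( i + 1 ))
done
echo "STDIN+STDOUT set:$both"
```

This prints -C63 for -CSDA and lists 3, 7, 11 and 15, which are exactly the values below 16 with $i % 4 == 3.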
You can also use the -Mutf8 option instead of putting use utf8; in your one-liner. -mutf8 does not work, because it is equivalent to use utf8 (); and the parens prevent the import method from being called. Since it's the import method that marks your source code as UTF-8, -mutf8 does nothing. But -Mutf8 is equivalent to use utf8; so it works.
However, putting -Mutf8 into PERL5OPT may break any script that uses non-ASCII ISO-8859-1 literals. That may be a risk you're willing to take, but you should be aware of it.