I am struggling with reading a UTF-8 text file and replacing all occurrences of a Unicode character (℃, DEGREE CELSIUS, U+2103) with some other string. Here is my script:
#!/usr/bin/env perl
use 5.030;
use warnings;
use utf8;
use Perl6::Slurp;
my $s= "Hello. This is 2.3℃ .";
$s =~ s/2.3/two-point-three /gms;
$s =~ s/\x{2103}/degrees celsius/gms;
print "STRING: '$s'\n";
my $fs= slurp("test.md");
$fs =~ s/2.3/two-point-three /gms;
$fs =~ s/\x{2103}/degrees celsius/gms;
print "FSYSTM: '$fs'";
My test.md file contains exactly the same text as the string:
Hello. This is 2.3℃ .
Why does the substitution work on the string literal but not on the file content? The output is:
STRING: 'Hello. This is two-point-three degrees celsius .'
FSYSTM: 'Hello. This is two-point-three ℃ .
You need to specify the file encoding when reading a non-ASCII file with Perl6::Slurp. The following works for me:
my $fs= slurp('<:utf8', "test.md");
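This reads the file and decodes its content from UTF-8 bytes into a Perl character string. Without the :utf8 layer, slurp returns the raw bytes, so ℃ arrives as the three bytes E2 84 83 and the pattern \x{2103}, which matches the single character U+2103, never matches. The string literal in your script worked because use utf8 already makes Perl decode the source code itself. See perluniintro and the Perl6::Slurp documentation for more information.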
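To see the difference between bytes and characters without Perl6::Slurp, here is a minimal sketch that uses only core modules. It is an illustration rather than a replacement for the fix above: the temporary file stands in for test.md, and read_file is a hypothetical helper written for this example, not part of any module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                       # the source itself contains a literal ℃
use File::Temp qw(tempfile);

# Create a throwaway UTF-8 file with the same content as test.md.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out, ':encoding(UTF-8)';
print {$out} "Hello. This is 2.3℃ .\n";
close $out or die "close failed: $!";

# Hypothetical helper: read a whole file through the given I/O layer.
sub read_file {
    my ($file, $layer) = @_;
    open my $in, "<$layer", $file or die "Cannot open $file: $!";
    local $/;                   # slurp mode: read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

my $raw     = read_file($path, ':raw');              # undecoded bytes
my $decoded = read_file($path, ':encoding(UTF-8)');  # decoded characters

# Count how often U+2103 matches in each version.
my $raw_hits     = () = $raw     =~ /\x{2103}/g;
my $decoded_hits = () = $decoded =~ /\x{2103}/g;

# The substitution only has something to replace in the decoded string.
(my $fixed = $decoded) =~ s/\x{2103}/degrees celsius/g;

binmode STDOUT, ':encoding(UTF-8)';
printf "raw:     length %d, matches %d\n", length $raw, $raw_hits;
printf "decoded: length %d, matches %d\n", length $decoded, $decoded_hits;
print  "fixed:   $fixed";
```

The raw string is two units longer than the decoded one, because ℃ takes three bytes in UTF-8 but is a single character once decoded. The pattern finds nothing in the raw bytes and one match in the decoded text. The :encoding(UTF-8) layer used here is the strict variant of :utf8 and also validates the input.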