I am reading lines from a file that holds UTF-8 octets written out as ASCII escape sequences ("semi-UTF-8"), and I want to convert them to Perl's internal character representation for further processing.
file.in (plain ASCII):
MO\\xc5\\xbdN\\xc3\\x81
NOV\\xc3\\x81
These should translate to MOŽNÁ and NOVÁ.
I load the lines and rewrite the escapes in \x{...} notation, i.e. \\xc5\\xbd -> \x{00c5}\x{00bd}. Then I would like to take the rewritten $line and have Perl represent it internally:
for my $line (@lines) {
    $line =~ s/x(..)/x{00$1}/g;
    eval { $l = "$line"; };
}
Unfortunately, without success.
use File::Slurp qw(read_file);
use Encode qw(decode);
use Encode::Escape qw();
my $string =
    decode 'UTF-8',            # octets → characters
    decode 'unicode-escape',   # \x → octets
    decode 'ascii-escape',     # \\x → \x
    read_file 'file.in';
Read the decode chain from the bottom upwards: each call consumes the result of the line below it.
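If installing Encode::Escape is not an option, the same idea can be sketched with core modules only: first turn each escape into the byte it names, then decode those bytes as UTF-8. Note that the attempt in the question cannot work, because eval { ... } with a block only traps exceptions; it does not re-parse the contents of $line as escape sequences. The helper name unescape_utf8 below is hypothetical, and the sketch assumes the input contains only \xHH escapes (preceded by one or more backslashes) and no other escape forms.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of any module): converts text containing
# literal "\\xHH" or "\xHH" escapes into a Perl character string.
sub unescape_utf8 {
    my ($text) = @_;

    # Step 1: replace backslash(es) + "x" + two hex digits with the raw byte.
    (my $octets = $text) =~ s/\\+x([0-9A-Fa-f]{2})/chr(hex($1))/ge;

    # Step 2: interpret the bytes as UTF-8. FB_CROAK makes decode die on
    # malformed input instead of silently substituting characters.
    return decode('UTF-8', $octets, Encode::FB_CROAK);
}

# Single-quoted strings: each "\\" below is one literal backslash,
# so these match the two-backslash form shown in file.in.
my @lines = ('MO\\\\xc5\\\\xbdN\\\\xc3\\\\x81', 'NOV\\\\xc3\\\\x81');

binmode STDOUT, ':encoding(UTF-8)';
print unescape_utf8($_), "\n" for @lines;   # prints MOŽNÁ, then NOVÁ
```

The two steps mirror the lower and upper parts of the decode chain above: the substitution does the work of the escape decoders, and decode 'UTF-8' turns octets into characters. Unlike a string eval, this never executes file contents as Perl code.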