In Perl, I read HTML pages and convert them to UTF-8 with Text::Iconv. But when a page declares an invalid character set, for example charset="blabla", the Perl program dies and prints "unsupported conversion". I tried setting Text::Iconv->raise_error to 0 or 1, but without success; the program always died.
How can I avoid the program crash? Or how can I check whether a character set is supported before conversion? (I know I can read the list from the OS with "iconv --list", but I hope a better solution exists.)
How can I avoid the program crash?
Perl uses eval for trapping errors:
use strict;
use warnings;
use 5.016;
use Text::Iconv;

my $source_encoding = 'blabla';
my $result_encoding = 'utf-8';

# eval returns undef if new() dies; the error message is left in $@.
my $converter = eval {
    Text::Iconv->new(
        $source_encoding,
        $result_encoding
    );
};

if (not $converter and $@ =~ /unsupported conversion|invalid argument/i) {
    say "Either the '$source_encoding' encoding or the ",
        "'$result_encoding' encoding\nis not available on this system.";
}

if ($converter) { #Can new() fail in other ways?
    my $result = $converter->convert('€');

    # convert() returns undef on failure; an empty string is still a valid result.
    if (not defined $result) {
        say "Some characters in '$source_encoding'\n",
            "are invalid in '$result_encoding'.";
    }
    else {
        say $result;
    }
}
In the [block] form, the code within the BLOCK is parsed only once--at the same time the code surrounding the eval itself was parsed--and executed within the context of the current Perl program. This form is typically used to trap exceptions more efficiently than the first (see below), while also providing the benefit of checking the code within BLOCK at compile time.
http://perldoc.perl.org/functions/eval.html
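The eval BLOCK form quoted above can be wrapped in a small helper so that a bad charset name never reaches the rest of the program. This is only a sketch: try_converter is a hypothetical name, not part of Text::Iconv. As far as I understand the Text::Iconv documentation, raise_error only controls how convert() reports errors, while a failure in new() is always raised as an exception, which would explain why toggling it made no difference.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Iconv): build a converter without
# letting a bad charset name kill the program.
# Returns ($converter, undef) on success or (undef, $error_message) on failure.
sub try_converter {
    my ($from, $to) = @_;
    my $converter = eval {
        require Text::Iconv;            # a missing module is trapped as well
        Text::Iconv->new($from, $to);   # dies on an unsupported conversion
    };
    return ($converter, undef) if $converter;

    my $error = $@ || 'unknown error';
    chomp $error;
    return (undef, $error);
}

my ($converter, $error) = try_converter('blabla', 'utf-8');
if ($converter) {
    my $utf8 = $converter->convert('some bytes');
    print defined $utf8 ? "$utf8\n" : "conversion failed\n";
}
else {
    print "Skipping page, no converter: $error\n";
}
```

Returning the error text instead of dying lets the caller decide what to do with a page whose declared charset is unusable, for example skip it or retry with a default encoding.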
Or how can I check whether a character set is supported before conversion? (I know I can read the list from the OS with "iconv --list", but I hope a better solution exists.)
What's so bad about iconv --list?
use strict;
use warnings;
use 5.016;
use Text::Iconv;

my $source_encoding = 'blabla';
my $result_encoding = 'utf-8';

my $available_encodings = `iconv --list`; #Backticks return a string.
my @encodings_arr = split /\s+/, $available_encodings;

# Some iconv implementations print names with a trailing "//" (or a comma),
# so strip that punctuation and lowercase each name before storing it.
my %encodings_set;
for my $name (@encodings_arr) {
    $name =~ s{[/,]+$}{};
    $encodings_set{lc $name} = undef;
}

# Lowercase the lookups too, so 'UTF-8' and 'utf-8' both match.
my $source_encoding_available = exists $encodings_set{lc $source_encoding};
my $result_encoding_available = exists $encodings_set{lc $result_encoding};

if ($source_encoding_available
        and $result_encoding_available) {
    say "Ready to convert";
}
else {
    if (not $source_encoding_available) {
        say "'$source_encoding' encoding not available.";
    }
    if (not $result_encoding_available) {
        say "'$result_encoding' encoding not available.";
    }
}
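The exact layout of the iconv --list output differs between iconv implementations, so it may help to keep the parsing in one function that can be tested without running the command. Below is a minimal sketch under that assumption. parse_iconv_list is a hypothetical name, and the two sample strings are illustrative fragments I wrote to imitate the two layouts I have seen (one name per line with a trailing "//", and space-separated aliases on one line); they are not complete listings.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the text printed by `iconv --list` into a
# lookup table keyed by lowercased encoding name.
sub parse_iconv_list {
    my ($text) = @_;
    my %set;
    for my $name (split ' ', $text) {
        $name =~ s{[/,]+\z}{};      # drop a trailing "//" or ","
        next if $name eq '';
        $set{lc $name} = 1;
    }
    return \%set;
}

# Illustrative samples only, not real complete listings.
my $one_per_line = "ISO-8859-1//\nLATIN1//\nUTF-8//\n";
my $alias_lines  = "ISO-8859-1 ISO_8859-1 LATIN1 L1\nUTF-8 UTF8\n";

for my $sample ($one_per_line, $alias_lines) {
    my $known = parse_iconv_list($sample);
    for my $charset ('utf-8', 'Latin1', 'blabla') {
        printf "%-8s %s\n", $charset,
            $known->{lc $charset} ? 'available' : 'not available';
    }
}
```

In a real program the argument would be the backtick output, for example parse_iconv_list(scalar `iconv --list`). Even with this check in place, keeping the eval around Text::Iconv->new is a sensible safety net, since the list only tells you which names the command-line tool knows.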