I have a class that has a string field, input, which contains UTF-8 characters. My class also has a method toString. I want to save instances of the class to a file using the method toString. The problem is that strange symbols are being written in the file:
my $dest = "output.txt";
print "\nBefore saving to file\n" . $message->toString() . "\n";
open (my $fh, '>>:encoding(UTF-8)', $dest)
or die "Cannot open $dest : $!";
lock($fh);
print $fh $message->toString();
unlock($fh);
close $fh;
The first print works fine; this is what gets printed to the console:
Input: {"paramkey":"message","paramvalue":"здравейте"}
The problem is when I write to the file:
Input: {"paramkey":"message","paramvalue":"здÑавейÑе"}
I used flock for locking/unlocking the file.
The string returned by your toString method is already UTF-8 encoded. That works fine when you print it to your terminal, because the terminal is expecting UTF-8 data. But when you open your output file with
open (my $fh, '>>:encoding(UTF-8)', $dest) or die "Cannot open $dest : $!"
you are asking Perl to encode the data as UTF-8 a second time. Perl treats each byte of the already-encoded data as a separate character and writes it out as its own UTF-8 sequence, which isn't what you want at all. Unfortunately you don't show the code for the class that $message belongs to, so I can't help you fix the problem there.
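To make the double encoding concrete, here is a minimal sketch using the core Encode module. The variable names are made up for illustration, and the string is written with \x{...} escapes so the example does not depend on the encoding of the source file. It assumes that toString returns something equivalent to $once, which I can't confirm without seeing the class.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string: the nine Cyrillic characters from the question.
my $chars = "\x{437}\x{434}\x{440}\x{430}\x{432}\x{435}\x{439}\x{442}\x{435}";

# Encoding once gives the correct UTF-8 bytes (two bytes per character here).
my $once = encode('UTF-8', $chars);

# Encoding a second time treats every one of those bytes as a character of
# its own and encodes it again. This mirrors what the :encoding(UTF-8) layer
# does when it is handed a string that is already encoded.
my $twice = encode('UTF-8', $once);

printf "characters:    %d\n", length $chars;
printf "encoded once:  %d bytes\n", length $once;
printf "encoded twice: %d bytes\n", length $twice;
```

The byte count doubles at each step, which is why the file contains more, and different, symbols than the original text.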
You can fix that by changing your open call to just
open (my $fh, '>>', $dest) or die "Cannot open $dest : $!"
which will avoid the additional encoding step. But you should really be working with decoded character strings throughout your Perl code: decode data as you read it from input files, and encode it as necessary when you write to output files.
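As a rough sketch of that approach: decode once where the data enters the program, and keep the :encoding(UTF-8) layer on the output handle. The input bytes and the file name utf8-demo.txt below are hypothetical stand-ins, since I don't know where your class gets its data from.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: the UTF-8 bytes for two Cyrillic characters, standing
# in for whatever encoded data the class currently holds.
my $bytes = "\xD0\xB7\xD0\xB4";

# Decode once, at the point where the data enters the program.
my $chars = decode('UTF-8', $bytes);

# Hypothetical file name; '>' is used so the demo starts from an empty file.
my $demo = 'utf8-demo.txt';
open(my $out, '>:encoding(UTF-8)', $demo) or die "Cannot open $demo : $!";
print $out $chars;    # the layer encodes the characters exactly once
close $out or die "Cannot close $demo : $!";

# Read the file back as raw bytes to check that nothing was encoded twice.
open(my $in, '<:raw', $demo) or die "Cannot open $demo : $!";
my $written = do { local $/; <$in> };
close $in;

printf "characters: %d, bytes on disk: %d\n", length $chars, length $written;
```

With the data decoded up front, your original open call with '>>:encoding(UTF-8)' would be the right one to keep.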