I'm chasing a bug in Perl code that seems to fundamentally be a version of this:
"Cannot decode string with wide characters" appears on a weird place
Basically, under certain conditions, Encode::decode('utf8', $string) is getting called twice on the same string, and hilarity ensues. Now, the best solution is to figure out what conditions are causing the double-decode and stop that from happening. Unfortunately, this is mature production code for a feature-rich product; figuring out those conditions and fixing them in a way that doesn't introduce regressions looks to be challenging.
Is there some fast, reliable way to detect whether a string has already been decoded from utf8? Inserting "if" statements before those calls feels a tad kludgy, but ought to be a pretty safe fix.
Encode has an is_utf8 function:
is_utf8(STRING [, CHECK])
[INTERNAL] Tests whether the UTF8 flag is turned on in the STRING. If CHECK is true, also checks the data in STRING for being well-formed UTF-8. Returns true if successful, false otherwise.
Note that this function is documented under the heading "Messing with Perl's Internals", so its behavior might change in future Perl versions.
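As a rough sketch of the "if" guard the question asks about, the check could be wrapped in a small helper so each call site stays a one-liner. The helper name decode_utf8_once is made up for illustration and is not part of Encode. It also assumes the UTF8 flag is a good enough proxy for "already decoded" in your code base, which is a heuristic: the flag describes Perl's internal storage, not the string's decoding history.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Encode): decode only when the
# UTF8 flag is not already set on the string.
sub decode_utf8_once {
    my ($string) = @_;

    # Leave undef alone, and skip strings Perl already stores as characters.
    return $string if !defined $string || Encode::is_utf8($string);

    return Encode::decode('utf8', $string);
}

my $bytes = "\xE2\x82\xAC";              # UTF-8 octets for U+20AC (euro sign)
my $once  = decode_utf8_once($bytes);    # decoded to one character
my $twice = decode_utf8_once($once);     # flag is set, so this is a no-op
```

Calling the helper a second time returns the string unchanged instead of handing a wide character back to Encode::decode. Strings that acquired the flag some other way would also be skipped, so this is best treated as a stopgap until the double-decode path is found.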