[OT] Encode woes

Fri Sep 25 10:32:29 BST 2009

Philip Newton wrote:
> On Fri, Sep 25, 2009 at 09:54, Dirk Koopman <djk at tobit.co.uk> wrote:
>> Dirk Koopman wrote:
>>> Now, is there a reasonably reliable way of determining what we have, on a
>>> string by string basis, to at least tell whether we are dealing with utf8 or
>>> iso-8859 (not caring which variant) so that I can drive Encode appropriately
>>> to avoid crashes of the above type.  Or how do I completely switch off utf8
>>> encoding/decoding - everywhere - in an 80,000 line perl app.
>> As no-one seems interested in this, or maybe no-one else has had these
>> problems themselves, can anyone suggest a better mailing list to poll?
> 
> I was going to suggest Encode::is_utf8 and/or utf8::is_utf8, but I
> wasn't sure whether it would actually solve your problem so I thought
> I'd rather stay quiet and hope someone with real-world experience in
> utf8 woes would pipe up.
> 

Sadly that does not do what one thinks. It is an internal flag that merely 
tells one whether perl has stored that scalar in its "internal form" (i.e. 
utf8) or whether it is still just some binary string. It says nothing about 
whether the bytes in a binary string happen to be valid utf8.
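
For what it is worth, one common heuristic for the string-by-string question is to attempt a strict UTF-8 decode and fall back to ISO-8859-1 when that fails. The following is only a minimal sketch under that assumption: decode_guess() is a hypothetical helper name, not part of Encode, and a Latin-1 string whose high bytes happen to form valid UTF-8 sequences would be misidentified (rare in real text, but possible).

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of Encode): return a character string,
# guessing between UTF-8 and ISO-8859-1 for each input string.
sub decode_guess {
    my ($bytes) = @_;

    # If perl already holds this scalar in its internal form, assume it
    # has been decoded already and leave it alone.
    return $bytes if utf8::is_utf8($bytes);

    # Strict decode: FB_CROAK dies on malformed input, LEAVE_SRC stops
    # decode() from modifying $bytes in place.
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $text if defined $text;

    # Not valid UTF-8: treat it as ISO-8859-1, which accepts any byte.
    return decode('ISO-8859-1', $bytes);
}
```

Pure ASCII passes the strict decode unchanged, so only strings containing high bytes are really being guessed at.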