LWP output encoding

Struan Donald lpm at exo.org.uk
Wed Nov 23 15:03:24 GMT 2005

* at 23/11 14:49 +0000 Andy Armstrong said:
> Googled. Can't figure. Can anyone update me on what the current  
> semantics of HTTP::Response->decoded_content are?
> Specifics: I'm parsing a bunch of RSS feeds. I have two, both of  
> which claim to be encoded UTF-8. I'm generating a hash for the  
> contents of the feeds like this
> my $content = $res->decoded_content;
> my $hash    = md5_base64($content);
> md5_base64() barfs on one of the feeds with
> "Wide character in subroutine entry"

I can't really answer the question of what it returns, but I cured
the same issue with Digest::MD5 by encoding the content before passing
it in, so that what it gets is plain old octets rather than a
character string, and all was well.

i.e. one does:

use Encode qw(encode);
use Digest::MD5 qw(md5_base64);
my $content = encode( 'utf8', $res->decoded_content );
my $hash    = md5_base64($content);
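
For anyone who wants to see the failure and the fix side by side, here
is a minimal sketch. The literal string is only a stand-in for whatever
$res->decoded_content hands back (an assumption, no HTTP request is
made), and it uses 'UTF-8', the strict spelling of the encoding name,
where I used 'utf8' above; for valid characters both give the same
octets.

```perl
use strict;
use warnings;
use Encode qw(encode);
use Digest::MD5 qw(md5_base64);

# Hypothetical stand-in for $res->decoded_content: a character string
# containing a code point above 255 (U+2019, right single quotation mark).
my $content = "It\x{2019}s a feed";

# Digest::MD5 works on bytes, so handing it wide characters makes it croak.
my $direct = eval { md5_base64($content) };
my $error  = $@;

# Encoding to UTF-8 octets first gives Digest::MD5 the bytes it expects.
my $octets = encode( 'UTF-8', $content );
my $hash   = md5_base64($octets);

print "direct call failed: $error" unless defined $direct;
print "hash of octets: $hash\n";
```

The eval is only there so the script carries on after the croak and can
print both outcomes; in real code you would just encode first.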

I am sure someone who understands more about this will be along to
explain why this is not a good idea...

